Chainlink
Senior Site Reliability Engineer, Node Platform
United States; Argentina; Bogota; Ciudad de México; São Paulo; Toronto; Vancouver (Remote)
Full-time | Global | Mid | Remote
Active, posted within the last 30 days
Job Description
[AI-summarized by JobStash]
You will design and build Kubernetes-based infrastructure primitives and the CRE control plane that enable deterministic horizontal scaling of decentralized oracle networks. You will develop Kubernetes Operators, scaling automation, and reusable platform components, codify scaling logic, and implement safe, repeatable infrastructure expansion. You will improve operational efficiency, diagnosability, and the scalability of stateful distributed systems.
Requirements
- 6–9+ years in SRE / Platform / Infrastructure Engineering
- Proven experience scaling Kubernetes in high-throughput production environments
- Deep knowledge of scheduler behavior
- Deep knowledge of StatefulSets & persistent workloads
- Deep knowledge of autoscaling strategies (HPA, VPA, KEDA, custom scaling)
- Resource management & performance tuning
- Multi-cluster and multi-region architectures
- Experience diagnosing production failures at cluster scale
- Strong Terraform or Crossplane experience
- GitOps workflows (ArgoCD / Flux) experience
- CI/CD reliability experience
- Automation-first mindset
- AWS production experience
- Proficiency in Go or equivalent systems language
Responsibilities
- Design infrastructure primitives for decentralized oracle networks
- Build Kubernetes-based control plane components
- Develop Kubernetes Operators and scaling automation
- Codify scaling logic into reusable operators and automation
- Ensure deterministic horizontal scaling of networks
- Implement safe and repeatable infrastructure expansion
- Improve operational efficiency and scalability
- Enhance diagnosability and observability for production systems
Tech Stack
operator, CI/CD, AWS, performance tuning, observability, VPA, Flux, decentralized systems, ArgoCD, GitOps