
QuickNode

Technical Operations Engineer

Canada · Full-time · Global
📊 Senior · 🏠 Remote

Job Description

[AI-summarized by JobStash]

You will ensure the stability, reliability, and performance of production systems. You will lead deployment and optimization of blockchain networks, troubleshoot complex Web3 incidents with log and JSON-RPC analysis, and coordinate with ecosystem partners. You will build and maintain monitoring and alerting solutions, define and enforce SLOs and SLAs, and implement automation using tools like Ansible, Terraform, and Kubernetes. You will collaborate with support, infrastructure, and development teams, and participate in a rotating 24/7 on-call schedule to address critical incidents and perform post-incident analysis.

Requirements

  • Minimum of 5 years in Technical Operations, Site Reliability Engineering, or related roles
  • Proven Linux/Unix system administration and advanced troubleshooting capabilities
  • Deep experience managing complex Web3 infrastructures, including RPC services, validator setups, and node operations
  • Hands-on experience with Helm, Terraform, Ansible, and Consul
  • Containerization experience with Docker and Kubernetes
  • Competency in Python, Go, and JavaScript
  • Proficiency in monitoring and analytics platforms such as Grafana and DataDog
  • Experience defining, measuring, and maintaining SLAs and SLOs, and using incident response tooling like PagerDuty
  • Ability to perform benchmarking, capacity and cost modeling, and root cause analysis
  • Strong interpersonal and communication skills

Responsibilities

  • Lead blockchain network deployments and optimization
  • Resolve complex Web3 incidents through troubleshooting and log analysis
  • Develop and maintain monitoring and alerting solutions using Grafana and DataDog
  • Define, implement, and enforce SLOs and SLAs
  • Implement and maintain automation with Ansible, Terraform, and Kubernetes
  • Collaborate with support, infrastructure, and development teams on system improvements
  • Participate in a rotating 24/7 on-call schedule to address critical incidents

Benefits & Perks

  • Quarterly bonus tied to company and individual goal achievement

Tech Stack

Linux, node, Kubernetes, Docker, incident response, benchmarking, validator, Python, monitoring
Expired