Manifold Labs (Targon)

Site Reliability Engineer

Remote, Full-time, Global
Status: Active (posted within the last 30 days)

Job Description

[AI-summarized by JobStash]

You will ensure services stay online and performant around the clock. You will optimize Kubernetes clusters including service mesh, metrics, and logging. You will benchmark services and identify infrastructure bottlenecks. You will improve observability and alerting to catch issues before they impact users, scale services to minimize downtime under load, and develop CI/CD pipelines for new and existing services.

Requirements

  • Hands-on experience with Kubernetes in production environments
  • Proficiency with Golang for systems and infrastructure tooling
  • Familiarity with confidential virtual machines (CVMs)
  • Experience with Prometheus, Loki, and Grafana for monitoring and observability

Responsibilities

  • Ensure services stay online and performant, including during off hours
  • Optimize Kubernetes clusters, including service mesh, metrics, and logging
  • Benchmark services and identify infrastructure bottlenecks
  • Improve observability and alerting systems to catch issues before they impact users
  • Scale services to minimize downtime under load
  • Develop CI/CD pipelines for new and existing services

Tech Stack

  • Loki
  • systems engineering
  • CD
  • benchmarking
  • CI
  • GitHub
  • service mesh
  • Kubernetes
  • metric
  • Linear
Expired