
Fortytwo

Senior MLOps Engineer

Remote | Full-time | Global | Mid
Active, posted within the last 30 days

Job Description

[AI-summarized by JobStash]

You will deploy and maintain production ML infrastructure, optimize GPU utilization, and serve large and small language models. You will build CI/CD pipelines, create Helm templates for Kubernetes deployments, implement model optimization and serving workflows, and set up monitoring, logging, and automated workflows to ensure reliable model delivery.

Requirements

  • ā—Bachelor's or Master's degree in Computer Science Engineering or related field
  • ā—Proficiency in Kubernetes Helm and containerization technologies
  • ā—Experience with GPU optimization including MIG and NOS
  • ā—Experience with cloud platforms such as AWS GCP and Azure
  • ā—Knowledge of monitoring tools such as Grafana and Prometheus
  • ā—Proficiency in scripting languages Python and Bash
  • ā—Hands-on experience with CI/CD tools and workflow management systems
  • ā—Familiarity with Triton Inference Server ONNX and TensorRT

Responsibilities

  • ā—Deploy scalable production-ready ML services with optimized infrastructure
  • ā—Manage and autoscale Kubernetes clusters
  • ā—Optimize GPU resources using MIG and NOS
  • ā—Manage cloud storage to ensure high availability and performance
  • ā—Integrate LoRA and model merging workflows
  • ā—Adapt and deploy state-of-the-art ML codebases
  • ā—Deploy and manage LLMs SLMs and LMMs
  • ā—Serve models using Triton Inference Server and other serving frameworks
  • ā—Leverage vLLM and TGI for model serving
  • ā—Optimize models with ONNX and TensorRT
  • ā—Develop Retrieval-Augmented Generation systems
  • ā—Set up monitoring and logging with Grafana Prometheus Loki Elasticsearch and OpenSearch
  • ā—Write and maintain CI/CD pipelines using GitHub Actions
  • ā—Create Helm templates for rapid Kubernetes node deployment
  • ā—Automate workflows using cron jobs and Airflow DAGs

Tech Stack

NOS, GitHub Actions, ONNX, fine-tuning, monitoring, GPU, Loki, S3, model serving, LMM
Expired