Risk Labs
Senior LLM Systems Engineer
USD 100,000 - 200,000/yr
Job Description
As a Senior LLM Systems Engineer, you will own the LLM-driven components of the oracle automation stack and ensure accuracy, performance, resilience, and operational quality for model-powered decisions. You will build evaluations, observability, tooling, fallbacks, and feedback loops that make LLM behavior measurable and dependable in real-world conditions. You will improve prompts, model selection, tool usage, structured outputs, retrieval, and evaluation coverage. You will design validation, retries, fallbacks, uncertainty handling, and human review paths for ambiguous inputs. You will build datasets, dashboards, traces, and review loops that surface model quality. You will enhance agent orchestration and tool use across internal services, APIs, search, workflows, databases, and external data sources. You will debug live issues, investigate regressions, improve runbooks, and reduce operator friction. You will be measured by broader coverage and by latency and cost improvements achieved while preserving quality.
Requirements
- 3+ years of professional software engineering experience in Python, TypeScript, or similar production languages.
- Hands-on experience building production systems that use LLMs, agents, retrieval, structured outputs, or model-powered workflows.
- Experience designing evaluations, test datasets, regression checks, quality metrics, or manual review loops for AI systems.
- Strong debugging ability across APIs, databases, queues, logs, model outputs, and external data sources.
- Practical understanding of prompt engineering, tool calling, structured output validation, retrieval, and common LLM failure modes.
- Ability to reason carefully about correctness in uncertain or adversarial environments.
- High agency, strong ownership, and clear written communication.
- Experience with oracle systems, prediction markets, DeFi protocols, or other crypto infrastructure.
- Experience with UMA, optimistic oracle mechanisms, Polymarket, or similar systems.
- Experience building agentic systems that use tools, search, browser automation, APIs, or database queries.
- Experience with LLM tracing, model monitoring, evaluation frameworks, or AI observability tools.
- Experience optimizing model cost and latency at scale.
- Experience with Postgres, data pipelines, queue-based systems, background jobs, or event-driven architectures.
- Familiarity with blockchain operational constraints, especially RPC limits, indexing, event logs, finality, and chain-specific behavior.
- Experience with GCP, Cloud Run, GitHub Actions, Terraform, or similar infrastructure.
Responsibilities
- Own and improve the LLM-driven components of the oracle automation stack.
- Improve LLM accuracy by refining prompts, model selection, tool usage, and retrieval.
- Improve system performance by reducing latency, token usage, and cost while preserving decision quality.
- Enhance resilience with validation, retries, fallbacks, uncertainty handling, and human review paths.
- Build evaluations, datasets, dashboards, traces, and review loops to make model quality visible.
- Improve agent orchestration and tool use across internal services, APIs, search, workflows, databases, and external data sources.
- Support production operations by debugging live issues, improving runbooks, and reducing operator friction.
Benefits & Perks
- Meaningful long-term equity participation
- 100% remote
- Flexible vacation and family care
- Training and development
- Remote work options
- At least two team-wide offsites a year