Bullish
Lead Engineer, AI Platform
London | Full-time | Global
Senior | On-site
Active | Posted within the last 30 days
Job Description
[AI-summarized by JobStash]
You will design and implement production AI systems with an emphasis on reliability, observability, and continuous evaluation. You will lead development of natural-language interfaces to business data and architect multi-agent systems that coordinate across data sources. You will build evaluation harnesses and testing frameworks to measure AI quality before production deployment. You will translate complex requirements into scalable AI solutions, mentor engineers, establish coding standards, and partner with data engineering to define and enforce semantic models.
Requirements
- 5+ years building production AI/ML systems with experience deploying LLM-based applications beyond proof-of-concept
- Hands-on experience with agent frameworks, tool-use patterns, and multi-step reasoning systems
- Experience with at least three of: LangChain, LangGraph, LlamaIndex, multi-agent frameworks, Model Context Protocol, DSPy, vector databases, structured output libraries, LLM inference infrastructure, cloud AI platforms, or evaluation and observability tools
- Strong background in data engineering, semantic modeling, or analytics infrastructure
- Proficiency in Python for AI/ML and cloud infrastructure (GCP preferred)
- Track record with CI/CD for ML, experiment tracking, and model governance
- Strong communication and ability to present to senior stakeholders
Responsibilities
- Design and implement production AI systems with emphasis on reliability, observability, and continuous evaluation
- Lead development of natural-language interfaces to business data
- Architect multi-agent systems and agent orchestration
- Build evaluation harnesses and testing frameworks measuring groundedness and factual consistency
- Translate complex requirements into scalable AI solutions with clear success metrics
- Mentor engineers and establish coding standards
- Partner with data engineering to define and enforce semantic models
Tech Stack
Semantic modeling, evaluation, AI, Vector database, ML, Python, data engineering, LLM, model governance, agent orchestration, project:CoinDesk