Keyrock

Senior Data Engineer

Brussels, Abu Dhabi, Amsterdam, ... | Full-time | Global

📊 Mid | 🏠 Remote

Job Description

[AI-summarized by JobStash]

You will build and operate streaming and batch data pipelines that ingest, normalise, and distribute market, trading, and portfolio data. You will design the lakehouse and time-series layers around consumer query patterns, own data contracts and schema evolution, and implement data quality, lineage, and self-healing. You will provide self-serve tooling, instrument observability, treat infrastructure as code, and work openly with architecture, infrastructure, platform, and product stakeholders. You will produce derived analytics such as cross-exchange spreads, VWAP, order book microstructure, and portfolio/performance views.

Requirements

●8+ years of building production data systems
●Strong proficiency in Python
●Strong proficiency in SQL and the ability to reason about query engines
●Strong understanding of data modelling for streaming and analytical workloads
●Experience designing and operating streaming systems (Kafka, Redpanda, MSK, or Kinesis)
●Experience with time-series stores in production (ClickHouse, TimescaleDB, QuestDB, or similar)
●Experience with lakehouse architectures and table layout, partitioning, and compaction decisions
●Experience building for idempotency and self-healing with safe reprocessing
●Experience with Docker, Terraform, and CI/CD
●Experience instrumenting logs, metrics, and traces for observability
●Experience designing data quality and governance frameworks covering contracts, validation, lineage, and ownership
●Understanding of financial market data (order books, trades, reference data, portfolios, exposures)
●Ability to design, ship, operate, and improve end-to-end data systems
●Nice to have: Lakehouse experience with Apache Iceberg or Delta Lake
●Nice to have: Familiarity with DataHub or similar metadata/lineage platforms
●Nice to have: Rust familiarity

Responsibilities

●Build streaming and batch pipelines, resilient to feed and exchange failures, that ingest, normalise, and distribute market, trading, and portfolio data
●Build self-serve tooling (SDKs, patterns, templates, AI agents) for publishing and consuming data products
●Own data contracts and manage schema evolution
●Design the lakehouse and time-series layer around consumer query patterns
●Build and evolve data governance and data quality frameworks including stale-feed detection, schema validation, range checks, idempotent writes, lineage, and ownership
●Build derived analytics such as cross-exchange spreads, VWAP at depth, order book microstructure, portfolio views, exposure, and performance
●Make observability, cost, and performance first-class
●Treat infrastructure as code (Docker, Terraform, CI/CD)
●Write documentation and partner closely with Architecture, Infrastructure, Platform, and other teams

Benefits & Perks

●Flexible hours
●Remote-first
●Business-hours on-call shared across the team
●Regular online get-togethers
●Yearly onsite
●Autonomy on how you work
●Strong cross-functional partners

Tech Stack

Docker, Python, SQL, streaming, ClickHouse, Rust, metrics, portfolio, VictoriaMetrics, time series