We are seeking an experienced Senior Data Engineer to own the pipeline-standardization and data-quality program for the enterprise lakehouse. This role ships compliance gates that block non-compliant deployments, stands up the data-quality framework, builds the dashboards business users trust, and drives measurable reductions in data incidents across the retail data estate.

Key Responsibilities

Design, ship, and operate a pipeline-compliance checker that validates naming, metadata, config schema, DQ-rule declarations, and cluster-policy reference on every new deployment.
Deploy a data-quality framework (Great Expectations, Databricks DQ Rules, or equivalent) across new production pipelines; build a domain onboarding template; configure alert routing by severity.
Build and publish the Data Quality Dashboard: quality health by domain, source, and table, refreshed in near real time and covering freshness, completeness, and accuracy.
Establish Source Change Management agreements with key source systems (SLA contracts, change-request process, automated schema-change alerting); map source lineage end-to-end.
Lead the migration playbook to bring the legacy pipeline estate to standard; mentor the engineers executing migrations; own the playbook, not every migration.
Drive data-incident reduction through prevention (compliance gate, DQ framework, source change management, lineage), not reactive firefighting; lead incident response and post-mortems for major DQ failures.
Partner with platform engineering on Event Stream domain-event schemas and data-product contracts.
Author runbooks, review code at a senior level, and contribute to engineering culture.

Requirements

Bachelor's degree in Computer Science, Data Engineering, or a related discipline.
5+ years designing, building, and operating production data pipelines on a major lakehouse or warehouse (Databricks, Snowflake, BigQuery).
Strong PySpark and SQL skills; solid understanding of Spark performance tuning at production scale.
Deep experience with data-quality frameworks (Great Expectations, dbt tests, Soda, Monte Carlo), including defining SLAs, setting thresholds, and tuning alert noise.
Experience building and operating medallion / multi-layer lakehouse architectures with explicit transformation layers.
Solid Git / CI experience for data code; informed opinions on testing data transformations.
Comfortable defining and enforcing standards (naming, partitioning, retention, PII tagging) and reviewing PRs against them.
Cloud platform experience (Azure preferred; AWS / GCP transferable).

Preferred Qualifications

Streaming experience (Spark Structured Streaming, Delta Live Tables, Flink, Kafka Streams).
Data modeling discipline (Kimball, Data Vault 2.0) with clear rationale.
Unity Catalog production experience (lineage, tags, RLS).

Senior Data Engineer (Pipelines & Data Quality) - Data Platform

Job details

Makro PRO