Principal Data Engineer

Checkbox technology

Posted about 13 hours ago

Full-time | Hybrid | Sydney

Reports to: VP of Engineering

 

About Us:

Checkbox is a Series A technology company building AI-native SaaS for in-house legal teams. Our platform helps legal teams manage how work is raised, understood, routed, actioned and resolved across the business.

 

We are now transforming our SaaS ecosystem into agentic-first products, using AI agents and intelligent workflow automation to change how legal work gets done. To support this, we are investing deeply in the data, context and retrieval foundations that power reliable AI product experiences.

 

This is a critical next chapter for Checkbox. Our data platform needs to become more than infrastructure. It needs to become a competitive moat.

 

The Role

We are looking for a Principal Data Engineer / Data Architect to own Checkbox’s data strategy and the architecture that delivers it.

 

This is a senior technical leadership role reporting directly to the VP of Engineering. You will lead our tier-one Data team, starting with one direct report, while remaining deeply hands-on in the design and delivery of the systems you define.

 

The Data team exists to serve the product engineering streams building AI agents, so close day-to-day partnership with Product, Engineering, AI and Platform teams is central to the role.

 

You will own the data and AI reference architecture that underpins our agentic product direction. This includes base data sources, intelligence services, context and retrieval layers, API and MCP surfaces, and the secure data access patterns required to give AI agents compliant, tenant-isolated context.

 

This is not a role for someone who only wants to design from a distance. We need someone who has built data and context foundations for AI products in production, can make pragmatic architecture decisions, and can lead by building.

 

What You’ll Own

Data and AI Reference Architecture

  • Own the data and AI reference architecture across Checkbox’s product ecosystem

  • Define how base data sources, intelligence services, APIs, MCP surfaces, context engines and retrieval layers fit together

  • Design patterns that allow product engineering streams to build AI features on shared foundations rather than reinventing data access each time

  • Establish clear architectural principles for storage, retrieval, serving, tenancy, observability and compliance

  • Work hand in glove with the Principal Engineer on architecture that spans application, platform, eventing, data and AI systems

 

Context, Retrieval and AI Data Foundations

  • Architect and build the context and retrieval layers that power AI agents and GenAI product experiences

  • Design secure, tenant-isolated data access patterns for AI systems

  • Define how customer, matter, workflow, document, event and operational data should be modelled, retrieved and served

  • Evaluate and select the right data patterns for each problem, including transactional stores, operational data stores, warehouses, vector stores, semantic layers, knowledge graphs, event streams and APIs

  • Make buy-vs-build decisions layer by layer rather than defaulting everything to in-house development

  • Ensure AI agents receive the right context at the right time, with appropriate controls around access, relevance, latency, cost and compliance

 

Data Platform, Integrations and Eventing

  • Own architecture across integrations, eventing, operational data, transactional data and AI-ready data

  • Define data contracts, event models, ingestion patterns and transformation approaches that scale across multiple products

  • Partner with product engineering teams to ensure new product capabilities generate usable, reliable and well-governed data

  • Build shared data capabilities that support analytics, AI agents, workflow automation and customer-facing product experiences

  • Improve data quality, lineage, observability and reliability across the data lifecycle

 

Security, Tenancy and Compliance

  • Treat security, tenancy and data isolation as first-order architectural concerns

  • Ensure data access patterns are compliant, auditable and appropriate for enterprise customers

  • Design systems that support tenant isolation, data segregation, least privilege access and secure retrieval

  • Partner with Platform, Security and Engineering teams on encryption, decryption, access control, auditability and production readiness

  • Ensure data and context systems can support customer trust, compliance requirements and future audit needs

 

Team Leadership and Technical Direction

  • Lead the Data team as a tier-one engineering function reporting to the VP of Engineering

  • Manage and grow one direct report initially, with scope to shape the team as the function expands

  • Set technical direction while staying close to implementation

  • Mentor engineers on data architecture, AI data foundations, retrieval patterns and pragmatic systems design

  • Build operating rhythms, standards and documentation that help the Data team scale

  • Act as the senior technical voice for data architecture across engineering leadership discussions

 

What Success Looks Like

  • Product engineering streams can build AI features on a dependable, shared data and context layer

  • The data and AI reference architecture is real, documented, actively used and continuously improved

  • Storage, retrieval and serving choices genuinely fit the problem rather than forcing every use case through one pattern

  • Data access is compliant, secure and tenant-isolated by default

  • AI agents can access relevant context reliably, with clear controls around quality, latency, cost and permissions

  • Data contracts, event models and retrieval patterns reduce duplication across product teams

  • The Data team becomes a strategic enabler for AI product development rather than a bottleneck

  • Checkbox’s data and context capability becomes visibly stronger as a competitive moat

 

About You

  • Significant experience as a senior data engineer, principal data engineer, data architect, staff engineer or similar technical leadership role

  • Proven experience building data and context foundations that power AI products in production

  • Strong experience designing data architectures across transactional, operational, analytical and AI-ready systems

  • Deep understanding of modern data platform patterns, including data contracts, event-driven architecture, ingestion, transformation, observability, lineage and governance

  • Strong understanding of AI data patterns such as retrieval systems, embeddings, semantic modelling, vector search, knowledge graphs, context engineering or agentic workflows

  • Experience working with multi-tenant SaaS systems where data segregation, tenancy and access control matter

  • Ability to make pragmatic architecture decisions and select the right tool or pattern for each problem

  • Strong judgement around buy-vs-build decisions across data infrastructure, retrieval, orchestration and AI platform layers

  • Comfortable leading technical direction while still building, reviewing and shipping

  • Experience mentoring engineers or leading small technical teams

  • Strong communication skills and ability to work closely with product engineering streams, platform teams and senior engineering leadership

  • Comfortable operating in a fast-moving environment where systems are being built, scaled and refined at the same time

 

Bonus Points

  • Experience working on AI agents, agentic workflows, GenAI platforms or AI-native SaaS products

  • Experience designing MCP or API surfaces for data and context access

  • Experience with legal tech, workflow automation, enterprise SaaS or document-heavy products

  • Experience with AWS-based data and platform infrastructure

  • Experience with event-driven systems, queues, Pub/Sub patterns or streaming architectures

  • Experience with data security, compliance, auditability and enterprise customer requirements

  • Experience growing a small data function from early foundations into a scalable team

 

What We Offer

  • Competitive salary

  • Hybrid working with team days in our Sydney CBD office

  • Direct reporting line to the VP of Engineering

Job details

Workplace: Hybrid

Location: Sydney

Experience: SE
