Staff Data Engineer (Entity Resolution & Identity)

Posted 8 Days Ago
Rotterdam, NLD
Hybrid
Senior level
Artificial Intelligence • Legal Tech • Software
The Role
Design and own durable entity resolution and identity systems for large volumes of unstructured legal data. Build canonical entities, identity graphs with merge/split semantics and provenance, incremental recomputation, reversible decisions, matching and consensus logic, observability, and data quality systems to support explainable, maintainable AI-driven legal search and reasoning.
Summary Generated by Built In

We’re building a world-class team to redefine knowledge work with AI

Zeno is a legal AI startup building a platform that helps lawyers research, review, and draft documents with real legal reasoning — not just text prediction. We’re developing technology that can:

  • Search and retrieve statutes, case law, and commentary with high precision.

  • Reason step-by-step, applying legal tests and weighing precedents.

  • Explain every answer transparently, so lawyers can trace conclusions back to the exact sources.

Where most tools automate surface-level tasks, we’re focused on replicating the way lawyers actually think through legal problems, making depth and trust the foundation of everything we build.

You’re joining an early-stage startup that is already working with leading firms. Backed by €3M in seed funding, we’re now scaling a team of engineers and thinkers who want to solve real problems, drive innovation, and create lasting change in the legal sector.

About the role

As a Staff Data Engineer — Entity Resolution & Identity, you will build and own the core systems that power Zeno’s data and AI platform. Your work sits at the heart of the product: determining what is the same, what is different, what is a version, and how those decisions evolve over time.

This role is centered on hard engineering problems. You’ll work with large volumes of unstructured and semi-structured legal data from many sources, formats, jurisdictions, and time periods. You’ll design systems that can evolve without constant reprocessing, where every decision is explainable, reversible, and traceable.

You operate at senior-to-staff level and take ownership of long-lived, mission-critical infrastructure where correctness, performance, and maintainability matter deeply.

What you’re working on

What you’ll build

  • A durable entity resolution framework

  • Canonical entities with stable IDs that survive logic and data changes

  • Identity graphs with merge/split semantics and full provenance

  • Matching and consensus logic balancing precision, recall, and durability

  • Incremental recomputation (no “reprocess the world” when logic improves)

  • Reversible decisions: merge, split, revert, replay

  • System-level data quality, validation, and observability

You’ll build systems that can reliably answer:

  • Are these two records the same real-world legal entity?

  • Is this a duplicate, a variant, a new version, or something else?

  • How do we evolve matching logic without rebuilding everything?

  • How do we make merges reversible and decisions explainable?

Constraints you’ll work under:

  • Unstructured and semi-structured legal data

  • Conflicting, incomplete, and shifting sources of truth

  • Long-lived correctness requirements across jurisdictions

Who you are

  • Staff-level experience in data engineering, solving non-trivial identity, deduplication, or consistency problems

  • Strong system design instincts and ownership mindset

  • Experience building complex, long-lived production systems

  • Strong programming skills (for example Python or similar)

  • Hands-on work with unstructured or semi-structured data

You think naturally in terms of:

  • Entity resolution, record linkage, and deduplication

  • Blocking and candidate generation trade-offs

  • Precision/recall calibration

  • Survivorship and conflict resolution

  • Graph connected components and merge cascades

  • Versioning, provenance, and replayability

Nice to have

  • Experience in legal, government, or other high-complexity document domains

  • Experience building human-in-the-loop review systems

Why this role
This is not a role focused on maintaining pipelines. You will design and build systems that do not yet exist as standard solutions in a domain where data identity, correctness, and evolution over time are exceptionally challenging.


The ride from startup to scale-up

Things will break, priorities will shift, and there won’t always be a playbook. You’ll wear multiple hats, ship fast, and learn faster. Some weeks will feel chaotic, some problems will feel bigger than your role. That’s the nature of the ride from startup to scale-up: if you need stability and structure, this won’t fit. But if you thrive on ownership, speed, and building from zero, you’ll love it here.

Why join us

  • Be part of a product-driven team reinventing how legal professionals work.

  • Join early and shape the foundation of a fast-growing, high-impact startup.

  • Work in a place where hierarchy doesn’t matter — only the best ideas do.

  • Collaborate with a top-tier team of engineers, researchers, and entrepreneurs.

  • Competitive compensation, employee benefits and strong upside as we grow.

  • An inspiring place to work in the heart of Rotterdam.

Shape the future of legal work with us.

Skills Required

  • Staff-level experience in data engineering solving identity, deduplication, or consistency problems
  • Strong system design instincts and ownership mindset
  • Experience building complex, long-lived production systems
  • Strong programming skills (for example Python or similar)
  • Hands-on experience with unstructured or semi-structured data
  • Experience with entity resolution, record linkage, deduplication, blocking and candidate generation
  • Familiarity with precision/recall calibration, survivorship, conflict resolution, graphs and merge cascades, versioning, provenance, and replayability
  • Experience in legal, government, or other high-complexity document domains
  • Experience building human-in-the-loop review systems

The Company
0 Employees
Year Founded: 2023

What We Do

Zeno is a legal AI startup that develops an AI workspace designed to help lawyers research, review, and draft documents using real legal reasoning rather than simple text prediction. The company focuses on creating technology that can precisely retrieve statutes and case law and provide transparent, step-by-step explanations for every answer.
