Cortea AI

Senior Data Engineer (Search) (m/f/x)

Reposted 11 Days Ago

Berlin, DEU

In-Office

80K-120K Annually

Senior level

Artificial Intelligence • Information Technology • Software

The Role

Design and implement search retrieval and ranking systems, build data pipelines, and evaluate performance metrics. Collaborate with teams to enhance system relevance.

Summary Generated by Built In

About us

We’re Cortea, a Berlin startup transforming audits with AI. Manual, document-heavy audits waste expert time while demand keeps rising. Our AI-powered software and specialized AI agents remove the repetitive work so auditors can focus on judgment.

Backed by top-tier VCs with more than 10m in funding, and with a working product and paying customers, we’re scaling rapidly.

We value first-principles thinking, speed, trust, and kindness. We build side by side in our Berlin office.

Your Role

You'll own the data and quality infrastructure that makes Cortea's AI pipeline trustworthy and continuously improving. That covers pipeline data, evals, observability, ground truth, and retrieval quality.

You're a builder. You write production Python. You think in pipelines and feedback loops, not notebooks. You've worked with LLM outputs in production and have strong opinions about how to make probabilistic systems measurable.

What you’ll do

Build and operate the data pipelines behind Cortea's AI. Every model call, every pipeline state, and every customer document should be captured, queryable, and observable
Create the foundation for evaluating agent performance and quality. Make probabilistic quality measurable, regression-detectable, and reproducible across model versions
Maintain observability of agent cost and optimizations
Improve document extraction and retrieval quality on the documents that matter most (financial statements, audit reports, complex tables)
Maintain the BigQuery data foundation that engineers, PMs, and founders use to make decisions
Partner with engineering and product to turn customer feedback into measurable, shipped improvements

Success at 6 months

Eval framework live across our core pipelines — every ship is measured before it goes out
Cost and quality observability on every pipeline run, alerting that catches regressions early
Document extraction and retrieval quality measurably better on the documents customers care about most
Trusted by engineers and founders to own the data foundation end-to-end

Qualifications

4+ years of total experience, including 3+ years shipping production data infrastructure (pipelines, warehouses, observability)
Strong Python and SQL. Reads code to understand data, doesn't just trust schemas
Has worked with LLM outputs in production. Has built or seriously used an eval framework
Comfortable with cloud data warehouses (BigQuery preferred, Snowflake/Redshift fine), distributed processing, batch and streaming
Cares about outcomes over process, clarity over frameworks
Comfortable in a startup environment: high autonomy, high ambiguity, high speed

Bonus

Built or seriously contributed to retrieval/RAG, document extraction, or OCR systems
GCP / BigQuery / Temporal experience
Background in audit, compliance, legal, or another document-heavy professional services domain
Speaks German

No one checks every box. If you’ve shipped retrieval systems and like owning evaluations and pipelines, let’s talk.

What we offer

Attractive compensation: competitive salary plus significant equity
High impact & growth: Shape AI at a scaling startup
Personal development: Learning budget for courses and conferences
Startup perks: Flexible vacation, team lunches, retreats, central Berlin office

Interview process

First Call: Intro to Cortea with our Founder's Associate Leon
Second Call: Technical interview with a member of our technical staff
Third Call: Deep dive into our culture with our Co-Founder Philipp
On-site Day (Berlin): Meet the team and work on a real problem together

We’re an equal-opportunity team and encourage women and underrepresented groups to apply.

Skills Required

5+ years as a software engineer or applied scientist with hands-on information retrieval work
Data engineering experience
Strong Python skills (asyncio, type hints, pandas)
Strong SQL skills (modeling, window functions, CTEs)
Knowledge of IR fundamentals: BM25, TF-IDF, neural retrieval
Experience with NDCG, MRR, Precision@K