White Circle

Data Engineer

Posted 25 Days Ago

2 Locations

Hybrid

60K-80K Annually

Mid level

Artificial Intelligence • Security • Software • Cybersecurity

The Role

Design, build, and maintain large-scale data pipelines and tooling for petabyte-scale logs, events, and model traces. Ingest, clean, transform, version, and monitor datasets; ship analytics and dashboards; support production, testing, and research workloads; troubleshoot pipeline issues and improve reliability.

Summary Generated by Built In

TLDR: This is a broad, full-stack data engineering role. You'll design and run production data pipelines, model data so it's usable across teams, and build the warehouse and tooling that turn raw data into something people can actually work with. You'll be the second data engineer on the team, which means real ownership from day one, a direct line to how the company operates, and no shortage of problems worth solving.

About us

White Circle is an AI Safety company building the safety, reliability, and optimization layer for AI systems. At the core of our platform are policies – simple natural-language rules that define what an AI model should and shouldn’t do. We automatically test, enforce, and continuously improve these policies at scale.

We’ve raised $11M from top funds, founders, and senior leaders at OpenAI, Anthropic, HuggingFace, Mistral, DeepMind, Datadog, Sentry, and others.
We process over one hundred million API calls every month.
We fine-tune and train our own LLMs so they run faster and cheaper than any open or proprietary model.

We’re a small, highly focused team. If you want to work deeply on hard problems, see your work ship to production quickly, and influence how AI safety is actually built – you’re the one we need.

What you’ll do

Design, build, and maintain production data pipelines - ingestion, transformation, and orchestration - that are reliable enough to be depended on.
Model and structure data in the warehouse so it's clean, documented, and genuinely useful to engineering, product, and research teams.
Own data infrastructure alongside the current data engineer: schema design, migrations, performance, and cost.
Provide data support to GTM engineers - deliver curated, enriched company and account data and the modeling layer that makes it query-ready.
Integrate and route diverse data sources into the warehouse, from internal product events to third-party enrichment feeds.
Occasionally build data-extraction jobs, including web scraping, when a source isn't otherwise available - one task among many, not the core of the role.
Improve data quality, observability, and documentation so the team can move quickly without breaking things.
Diagnose and fix pipeline issues before they become someone else’s problem.
Jump into data and infra tasks where needed and make things more robust.

What we’re looking for

Solid experience as a data engineer building and running production pipelines and data warehouses.
Strong SQL and Python, and comfort designing data models that other people build on.
Hands-on experience with a modern data stack: relational databases, a columnar/analytics store, orchestration, and transformation tooling.
You’ve worked with PostgreSQL (or similar) and understand how to structure and query data efficiently.
A production mindset - you care about reliability, migrations, performance, and cost, not just getting a query to return.
Able to work independently and own problems end-to-end in a fast-moving, early-stage environment.
You communicate clearly and can work in English.

Nice to have / Big plus

Experience with streaming/eventing (e.g., Kafka).
Experience versioning datasets and building lightweight data pipelines.
Experience with dbt for data modeling / analytics engineering, and with modern ETL tools (Fivetran, Airbyte, dlt).
Experience with AWS (Athena, Glue, S3, etc.).
Familiarity with GTM/CRM data workflows and enrichment tools (Crunchbase, Clay).
Some web-scraping experience.
Comfort with a systems language (we use Rust for internal data tooling and SDKs).

Why White Circle

Join early, with a direct hand in the infrastructure that scales us.
Real ownership and a broad remit across platform, product, and research data.
A small, senior team that moves fast and ships.
Paid time off in line with your local regulations, no matter where you work from.
Work from Paris (hybrid) + relocation package.
Best medical insurance in France.
All the hardware, tools, and services you need.
Covered subscriptions for AI agents and IDEs.
Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez.

How we hire

Intro call with one of our colleagues
Complete the take-home exercise
Show your best during the technical interview
Final call with our CEO and CTO

Please submit your application in English.

Skills Required

Strong in Python
Strong in SQL
Experience working with messy, real-world data
Solid experience with web scraping and building reliable, monitored pipelines
Experience with PostgreSQL or similar relational databases
Experience with cloud data warehouses (ClickHouse, BigQuery, Redshift, Snowflake, etc.)
Experience designing and building data pipelines for large-scale datasets
Ability to build data products, analytics, and dashboards used by stakeholders
Clear communication in English
Experience with Metabase or similar BI tools
Experience versioning datasets and building lightweight data pipelines
Experience with dbt for data modeling / analytics engineering
Familiarity with ETL tools (Fivetran, Airbyte, dltHub)
Experience with AWS (Athena, Glue, S3)
Experience with AI-assisted or agentic coding

The Company

23 Employees

Year Founded: 2025

What We Do

White Circle is an enterprise AI control platform specializing in automated vulnerability detection and protection for AI systems. The company provides a unified system for testing, monitoring, and safeguarding AI applications in real time, focusing on blocking unsafe inputs, preventing jailbreaks, and optimizing model performance. Its mission is to secure AI systems and ensure they remain safe and controllable for businesses worldwide.