Senior Data Engineer - AI Focused (x/f/m)

Posted 2 Days Ago
Paris, Île-de-France
In-Office
Senior level
Healthtech • Software
The Role
The Senior Data Engineer will build and maintain data pipelines on GCP for AI models, ensuring data quality and collaborating with teams on data governance.
What you’ll do

At Doctolib, we're on a mission to transform healthcare through the power of AI. As a Senior Data Engineer, you'll play a key role in building and optimizing the data foundations within the AI Team to deliver safe, scalable, and impactful models.

You will join a dedicated team working on data infrastructure for LLM, VLM and RAG-based systems, powering our new AI Medical Companion.

Your work will ensure that our engineers and data scientists can train, evaluate, and deploy AI models efficiently on high-quality, well-structured, and compliant data.

Your responsibilities include but are not limited to:

  • Ensure high standards of data quality for AI model inputs.
  • Design, build, and maintain scalable data pipelines on Google Cloud Platform (GCP) for AI and machine learning use cases.
  • Implement data ingestion and transformation frameworks that power Retrieval systems and training datasets for LLMs and multimodal models.
  • Architect and manage NoSQL and Vector Databases to store and retrieve embeddings, documents, and model inputs efficiently.
  • Collaborate with ML and platform teams to define data schemas, partitioning strategies, and governance rules that ensure privacy, scalability, and reliability.
  • Integrate unstructured and structured data sources (text, speech, image, documents, metadata) into unified data models ready for AI consumption.
  • Optimize performance and cost of data pipelines using GCP native services (BigQuery, Dataflow, Pub/Sub, Cloud Storage, Vertex AI).
  • Contribute to data quality and lineage frameworks, ensuring AI models are trained on validated, auditable, and compliant datasets.
  • Continuously evaluate and improve our data stack to accelerate AI experimentation and deployment.
Who you are

You could be our next teammate if you have:

  • Master’s or Ph.D. degree in Computer Science, Data Engineering, or a related field.
  • 5+ years of experience in Data Engineering, ideally supporting AI or ML workloads.
  • Strong experience with the GCP data ecosystem.
  • Proficiency in Python and SQL, with experience in data pipeline orchestration (e.g., Airflow, Dagster, Cloud Composer).
  • Deep understanding of NoSQL systems (e.g., MongoDB) and vector databases (e.g., FAISS, Vector Search).
  • Experience designing data architectures for RAG, embeddings, or model training pipelines.
  • Knowledge of data governance, security, and compliance for sensitive or regulated data.
  • Familiarity with W&B / MLflow / Braintrust / DVC for experiment tracking and dataset versioning (extract snapshots, change tracking, reproducibility).
  • Familiarity with containerized environments (Docker, Kubernetes) and CI/CD for data workflows.
  • A collaborative mindset and passion for building the data foundations of next-generation AI systems.
What we offer
  • Free comprehensive health insurance for you and your children
  • Parent Care Program: additional leave on top of the legal parental leave
  • Free mental health and coaching services through our partner Moka.care
  • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
  • Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
  • Work Council subsidy to refund part of a sport club membership or a creative class
  • Up to 14 days of RTT
  • Lunch voucher with Swile card
The interview process
  • Recruiter Interview
  • Feature Building Interview
  • System Design Interview
  • Behavioral Interview
  • At least one reference check
Job details
  • Permanent position
  • Full-time
  • Paris, France
  • Hybrid mode: 2 days remote per week
  • Start date: as soon as possible

If you would like to find out more about tech life at Doctolib, feel free to read our latest Medium blog articles!

At Doctolib, we are committed to improving access to healthcare for everyone. This translates into our recruitment process. We evaluate candidates based solely on qualifications and motivation, without any form of discrimination.

The more diverse ideas are heard, the more our product will truly improve healthcare for all. You are welcome to apply to Doctolib, regardless of your gender, religion, age, sexual orientation, ethnicity, or disability.

To ensure equal opportunities, we invite you to exclude personal information (e.g. pictures, age) from your applications. If you require any accommodation, please let us know for support during the hiring process.

Join us in building the healthcare we all dream of!

All information provided is processed by Doctolib for application management. For data processing details, click here. Please contact hr.dataprivacy(at)doctolib.com for inquiries or to exercise your rights.

Top Skills

Airflow
Braintrust
CI/CD
Cloud Composer
Dagster
Docker
DVC
FAISS
GCP
Google Cloud Platform
Kubernetes
MLflow
MongoDB
NoSQL
Python
SQL
W&B

The Company
HQ: Île-de-France
3,117 Employees

What We Do

Since Doctolib's creation in 2013, we have had one purpose: strive for a healthier world.

1. We aim to improve the daily lives of care teams by providing them with a new generation of technologies and services.

2. We aim to improve health for all, by offering a fast and frictionless journey for all care episodes, creating new ways for people to receive care and empowering them to become actors of their health.

At Doctolib, we are honored to work in the healthcare field and we believe that innovation in healthcare should be handled differently. We apply 4 guiding principles in everything we do:

1. We create helpful solutions for care teams and people.
2. We serve everyone equally and create well-designed and accessible technologies.
3. We team up with our users to strive for a healthier world and act as one team.
4. We protect our users' privacy. It’s their health, their data.

To achieve our purpose, we are assembling a team dedicated to improving healthcare, with a human-centric approach and an entrepreneurial mindset.

www.doctolib.com