As a Senior Data Engineer, you will play a key role in designing, building, and evolving AQEMIA’s data platform. You will work at the intersection of software engineering, data infrastructure, machine learning, and scientific research, enabling teams across the company to access trusted, scalable, and high-quality data.
You will contribute to the architecture of our data ecosystem, build reliable data pipelines, improve data accessibility and observability, and help establish best practices that support AQEMIA’s growing scientific and business needs.
Much of what makes this role distinctive is the data itself: chemical structures, molecular conformations, physics- and ML-based predictions, and experimental results from CROs and partners. A core part of the work is modeling these scientific entities well: establishing canonical identity, provenance, and trustworthy lineage across heterogeneous and often messy sources so that scientists, and increasingly AI solutions and agents, can rely on them.
This is an opportunity to work on complex and impactful data challenges in a unique environment where physics, AI, and drug discovery converge.
Responsibilities
- Design, build, and maintain scalable data pipelines supporting scientific, machine learning, and operational workloads.
- Contribute to the evolution of AQEMIA’s data platform, architecture, and infrastructure.
- Develop reliable systems for data ingestion, transformation, storage, and distribution across multiple data sources.
- Model canonical scientific entities (compounds, structures, assays, predictions), establishing identity, provenance, and lineage across heterogeneous experimental and in-silico sources.
- Improve data quality, reliability, observability, and governance practices across the organization.
- Implement monitoring, validation, lineage, and documentation standards to ensure trust in data assets.
- Build self-service capabilities that enable scientists, engineers, and technical teams to discover and consume data efficiently.
- Optimize performance, scalability, and cost-efficiency of data infrastructure running in cloud environments.
- Collaborate closely with Software Engineering, AI, Product, and Scientific teams to translate complex requirements into scalable technical solutions.
- Contribute to DataOps practices, including testing, deployment automation, operational excellence, and platform reliability.
- Mentor and support other engineers through technical guidance, knowledge sharing, and engineering best practices.
- Participate in architecture discussions, technical planning, and long-term platform evolution initiatives.
Qualifications
- 5+ years of experience in Data Engineering, Platform Engineering, or related infrastructure-focused roles.
- Strong proficiency in Python and SQL, with experience building and maintaining production-grade data systems.
- Demonstrable interest in, or experience with, scientific data domains such as computational chemistry, bioinformatics, or life sciences. As AI tooling takes over more of the routine side of engineering, deep understanding of what the data means is increasingly what differentiates great work, so curiosity here matters as much as raw engineering speed.
- Experience designing and operating scalable data pipelines and data platforms.
- Strong understanding of data modeling, analytical warehouse design, and modern data architecture principles.
- Experience with cloud environments, preferably AWS.
- Hands-on experience with Snowflake, dbt, and a workflow orchestrator such as Airflow.
- Familiarity with DataOps practices, including observability, testing, monitoring, and deployment automation.
- Experience collaborating with cross-functional stakeholders to deliver reliable and scalable data solutions.
- Strong problem-solving skills and ability to navigate complex technical environments.
- Excellent communication skills and ability to explain technical concepts to diverse audiences.
- Experience supporting machine learning, scientific computing, or research-oriented environments.
- Experience with workflow orchestration platforms and event-driven data architectures.
- Experience implementing data governance, lineage, and metadata management solutions.
- Track record of improving platform scalability, reliability, and operational maturity.
- Experience mentoring engineers or helping shape engineering best practices across teams.
Our Recruitment Process
- Talent Acquisition Interview (30 min)
- Hiring Manager Interview with Jonathan, Data Manager (45 min)
- Technical Assessment (60 min)
- VP Interview with Sylvain, VP Engineering (45 min)
- Culture Fit Interview with Emmanuelle Martiano, Co-founder & COO (45 min)
- Final Interview with Maximilien Levesque, Co-founder & CEO (60 min)
Why Join Us?
Skills Required
- 5+ years of experience in Data Engineering, Platform Engineering, or related infrastructure-focused roles
- Strong proficiency in Python
- Strong proficiency in SQL
- Experience with cloud environments (preferably AWS)
- Hands-on experience with Snowflake, dbt, and a workflow orchestrator such as Airflow
- Experience designing and operating scalable data pipelines and data platforms
- Strong understanding of data modeling, analytical warehouse design, and modern data architecture principles
- Demonstrable interest in or experience with scientific data domains (computational chemistry, bioinformatics, life sciences)
- Familiarity with DataOps practices, including observability, testing, monitoring, and deployment automation
- Experience collaborating with cross-functional stakeholders to deliver reliable and scalable data solutions
- Strong problem-solving skills and excellent communication abilities
- Experience supporting machine learning, scientific computing, or research-oriented environments
- Experience implementing data governance, lineage, and metadata management solutions
- Experience with event-driven data architectures and additional workflow orchestration platforms
- Track record of improving platform scalability, reliability, and operational maturity
- Experience mentoring engineers or shaping engineering best practices
What We Do
AQEMIA is a next-gen pharmatech company generating one of the world's fastest-growing drug discovery pipelines. Our mission is to rapidly design innovative drug candidates for dozens of critical diseases, in areas such as immuno-oncology. Our unique approach leverages quantum-inspired physics algorithms to power generative AI in designing novel drug candidates without relying on experimental data. We have already delivered several drug discovery successes within our internal pipeline and through collaborations with pharmaceutical companies. Our most advanced programs are currently in in vivo optimization.
We are growing and hiring! Check our career website: https://jobs.lever.co/aqemia.com
Discover the roles and behind-the-scenes at AQEMIA on our Welcome To The Jungle page: https://www.welcometothejungle.com/fr/companies/aqemia






