CuspAI is the frontier AI company on a mission to solve the breakthrough materials needed to power human progress. While nature took billions of years to perfect molecules, we are harnessing AI to unlock trillion-dollar materials breakthroughs in months, not millennia. Our founding team is the most cited in the world, comprising world-class researchers in AI, chemistry and engineering.
We are working on some of the hardest and most important challenges including energy, clean water, the future of compute, and carbon capture, and this is just the start of what our 'search engine' for next-generation materials will unlock.
We invite you to be part of a diverse, innovative team at the intersection of AI and materials science, working to create impactful partnerships that drive innovation, scalability, and industry collaboration. This work matters. Your work matters.
We’re on the cusp of the on-demand materials era. Join us.
As we grow, we are seeking a Data Engineer to play a crucial part in driving our research and development efforts forward.
As a Data Engineer, you will be part of the new team building the infrastructure that underpins our research and acts as the critical bridge between raw chemical data and our machine learning models.
Your main focus will be to build the pipeline infrastructure and tooling for data ingestion, moving towards self-serve setup for the scientific team members. You'll also be responsible for securing, collecting, cleaning, standardising, and tagging diverse chemical datasets to create high-quality training data for our ML researchers while working closely with our chemistry team to ensure scientific accuracy.
Data Pipeline Development
Design and build robust data pipelines for materials science datasets, experimental results, and computational chemistry outputs.
Develop processes to integrate diverse data sources including materials databases, literature, patent filings, and laboratory instruments.
Create automated workflows for processing crystallographic data, molecular structures, and materials properties (you don’t need to have direct domain experience - we can help bring you up to speed!).
Build scalable systems to handle high-throughput computational chemistry calculations and experimental data.
Data Quality & Standardisation
Partner closely with the scientific and research teams to implement automated quality checks for crystal structure data, chemical compositions, and experimental measurements.
Create standardisation protocols for materials nomenclature, units, and measurement conditions.
Build monitoring systems to ensure data integrity across all pipelines.
Collaboration & Integration
You will also be working hand in hand with ML researchers to understand data requirements for model training and inference.
Partner with materials scientists to ensure accurate representation of domain knowledge in data schemas.
Integrate with laboratory automation systems and computational chemistry software.
Support real-time data needs for AI-driven materials discovery experiments.
You are someone who gets excited about the opportunity to enable scientists to work on world-changing challenges in this domain, with a personal interest in the potential applications of the technology that Cusp is building.
You’re a builder of tools and infrastructure who enjoys making life as easy as possible for the teams, providing self-serve, reliable and scalable ingestion pipelines.
You have at least 3 years' experience in data engineering roles, preferably in scientific or research environments - you would be joining as a data engineering subject matter expert who can not only work autonomously but also provide guidance on best practice.
High level of proficiency in Python and databases with experience in large-scale data processing - as part of our engineering team you’ll be programming regularly, not just scripting.
You’re an advanced user of workflow orchestration tools (e.g. Airflow, Prefect, Dagster, Flyte or similar).
Solid experience with containerisation (Docker, Kubernetes) and CI/CD practices.
You have direct experience handling large/complex datasets and are interested in working with scientific packages.
You’re a fast learner when it comes to new tools/systems.
You enjoy (and have experience in) designing systems that scale with growing data volumes and user demands.
Understanding and appreciation of DevOps practices is also important.
You’ve worked with data from scientific computing (simulations or experiments).
Knowledge of machine learning data requirements and MLOps practices, including pre-processing/processing as part of model training.
An academic background in Materials Science, Chemistry, Chemical Engineering, or related field.
Even more bonus points if you have an understanding of crystallography, materials properties, and computational chemistry concepts!
What We Offer
A competitive salary: We value and reward impact and growth
Equity in CuspAI: You have a stake in the success of the company
Time off to stay fresh: 28 days holiday (DE, NL, UK) or 21 days holiday (JP, SG, US), in addition to local public holidays
‘Gold Standard’ parental leave: 26 weeks (primary caregiver) and 12 weeks (secondary caregiver) at full pay - we look after you and your family while we work on the most important materials discovery problems together
Professional development budget: We invest in your career development so you can stay up to date with the latest industry knowledge or add to your skills to increase impact and growth
Solve meaningful problems: See how your work has a direct impact on advancing materials science and solving sustainability and climate-related problems through the creation and application of bleeding-edge SOTA technology and revolutionary techniques
True interdisciplinary teamwork: Be part of a deeply collaborative environment bridging AI research, computational chemistry, and experimental science - work with world-class researchers and engineers who enjoy sharing knowledge and supporting each other
Join us in shaping the future of materials with AI. Together, we can create groundbreaking solutions for a more sustainable world.
CuspAI is an equal opportunities employer committed to building a diverse and inclusive workplace. We do not discriminate on the basis of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy or related condition (including breastfeeding), veteran status, or any other basis protected by applicable law.
We actively encourage applications from all backgrounds and value the unique perspectives and contributions that diversity brings to our team.
Please let us know if you require any specific adjustments during or after the interview process. We will do everything we can within reason to accommodate you.
Skills Required
- 3+ years experience in data engineering roles
- High proficiency in Python and databases
- Experience with workflow orchestration tools
- Experience with containerisation and CI/CD practices
- Experience handling large/complex datasets
What We Do
CuspAI is an AI company developing a platform to accelerate the discovery and development of advanced materials, using AI models and simulations to identify and optimize novel materials for industrial and scientific applications.








