Machine Learning Scientist — Agentic data pipelines

Remote or Hybrid
148K-210K Annually
Mid level
Artificial Intelligence • Software • Biotech
The Role
Design and build automated systems for data acquisition, cleaning, and quality control for biomedical data. Collaborate with ML scientists and maintain data pipelines.

Job Summary

We are seeking a scientist to join our team at Iambic Therapeutics, working on data acquisition and curation for Enchant, our multimodal transformer model trained at scale on a wide variety of biomedical data. In this role, you will design and build agentic systems that acquire, clean, format, and quality-control the large-scale datasets that power Enchant training. You will work at the intersection of LLM-based automation and biomedical data engineering—developing AI agents that can navigate heterogeneous data sources, enforce quality standards, and operate reliably at scale.

This role is ideal for candidates who combine strong software engineering instincts with scientific understanding of biomedical data, and who are excited about using LLMs as tools to solve practical data problems.

Key Responsibilities

  • Design, build, and maintain agentic systems for automated data acquisition from public and proprietary biomedical data sources

  • Develop LLM-based pipelines for data cleaning, normalization, and formatting across diverse data modalities (e.g., molecular, genomic, clinical, literature)

  • Implement automated quality-control workflows that detect anomalies, flag inconsistencies, and enforce data standards

  • Evaluate and iterate on agent architectures, prompting strategies, and tool-use patterns to improve reliability and throughput

  • Collaborate with ML scientists on the Enchant team to understand data requirements and translate them into scalable acquisition and processing systems

  • Monitor and maintain data pipelines in production, diagnosing failures and improving robustness over time

  • Document data provenance, processing decisions, and quality metrics to support reproducibility and auditing

Qualifications

Required:

  • Master's or PhD in a computational STEM field, or equivalent industry experience

  • Strong Python engineering skills, including experience building and maintaining production-quality software

  • Hands-on experience with LLM APIs (e.g., Claude, GPT) and agentic patterns such as tool use, orchestration, and multi-step reasoning

  • Familiarity with biomedical or chemical data sources and formats (e.g., PDB, UniProt, ChEMBL, SDF/MOL, FASTA, or similar)

  • Comfort with data engineering fundamentals: ETL design, data validation, and working with structured and unstructured data at scale

Desired:

  • Experience with agent orchestration frameworks

  • Familiarity with cloud infrastructure and workflow orchestration (e.g., AWS, Docker, Kubernetes)

  • Knowledge of multimodal biomedical data spanning small molecules, proteins, assays, images, 'omics, and/or clinical records

  • Experience with large-scale dataset construction or curation for ML model training

Location

Remote (US or UK). On-site available in Bristol, UK and Boston, US.

ABOUT IAMBIC THERAPEUTICS

Iambic is a clinical-stage life-science and technology company developing novel medicines using its AI-driven discovery and development platform. Based in San Diego and founded in 2020, Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters. The Iambic platform has demonstrated delivery of new drug candidates to human clinical trials with unprecedented speed and across multiple target classes and mechanisms of action. Iambic is advancing a pipeline of potential best-in-class and first-in-class clinical assets, both internally and in partnership, to address urgent unmet patient need. Learn more about the Iambic team, platform, pipeline, and partnerships at iambic.ai.

MISSION & CORE VALUES

Our mission is to deliver better medicines through innovations in AI-based discovery technologies. The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.

PAY AND BENEFITS

We offer industry-leading pay, company-paid healthcare, flexible spending accounts, voluntary life insurance, 401(k) matching, and uncapped vacation to our team. We are in a brand-new, state-of-the-art facility in beautiful San Diego with an onsite gym, dining, and easy access to great places to live and play.

The Company
HQ: San Diego, California
104 Employees
Year Founded: 2019
