Maven Bio

Data Platform Engineer

Reposted 16 Days Ago

Boston, MA, USA

In-Office

Mid level

Artificial Intelligence • Machine Learning • Biotech • Business Intelligence

The Role

Build and operate LLM-augmented ETL pipelines and domain-specific information retrieval systems. Design, deploy, and optimize production-grade data ingestion, databases, and infrastructure on AWS. Own projects end-to-end, collaborate with leadership and customers, and refine data priorities based on user feedback.

Summary Generated by Built In

Maven Bio builds domain-specific AI for the BioPharma industry.

Our clients include publicly traded BioPharma companies, venture capital firms, and global consultancies. We're based in Boston, the heart of the global BioPharma industry. As a Data Platform Engineer, you'll play a central role in enhancing the industry datasets that directly inform our clients' strategic decision-making.

What You’ll Do:

Collaborate directly with our CEO and technical leadership on strategic decisions and data strategy
Design, build, and optimize LLM-augmented ETL pipelines
Implement, experiment with, and optimize domain-specific information retrieval systems
Own projects end-to-end, from conception to deployment, ensuring robustness, scalability, and accuracy
Interface closely with our customers and internal teams to continuously refine data priorities based on user feedback

What We’re Looking For:

Professional experience building production-grade data ingestion pipelines
Strong proficiency with Python and building robust APIs
Experience designing and integrating LLM-enabled ETL pipelines
Expertise with relational databases (PostgreSQL preferred) and infrastructure management on AWS (Kubernetes preferred)
Demonstrated ability to rapidly learn and deeply engage with complex industries (biopharma experience is a significant plus)
A builder mentality with side projects or professional experiences showcasing your ability to create innovative solutions
You're in Boston or willing to relocate, you want to work in person, and you're excited to work in a low-meeting environment

What We Offer:

Career Acceleration: Join a rapidly growing YC startup with sustained market traction that serves some of the top names in BioPharma and is backed by strong financial resources
Impact & Ownership: Directly influence product direction and technological decisions
Balanced Intensity: We aim for highly productive, focused work during the week (45-55 hrs) and value offline weekends
Cutting-edge Technology: Opportunity to work at the forefront of generative AI paired with a proprietary database of BioPharma knowledge

Our Team:

We are a ~10-person team that combines decades of experience from top-performing technology, biopharma, and consulting firms (McKinsey, Google, Airbnb, Valo Health, NeuTrace, Science.io)

Skills Required

Professional experience building production-grade data ingestion pipelines
Strong proficiency with Python and building robust APIs
Experience designing and integrating LLM-enabled ETL pipelines
Expertise with relational databases (PostgreSQL preferred)
Infrastructure management on AWS (Kubernetes preferred)
Demonstrated ability to rapidly learn and engage with complex industries (biopharma experience a plus)
Builder mentality with side projects or professional experiences showcasing innovation
Willingness to be in Boston, work in-person, and operate in a low-meeting environment

The Company

Year Founded: 2023

What We Do

Maven Bio is an AI-native platform that transforms BioPharma knowledge work by pairing domain-specific AI with curated industry data. It helps consultants, investors, and corporate teams accelerate asset sourcing, due diligence, and competitive landscaping. The platform features modules for research tasks, indication landscapes, company/drug monitoring, and instant answers, allowing users to move from question to decision with less manual effort.