ML Engineer, Open Source

Posted 23 Days Ago
2 Locations
In-Office
Mid level
Artificial Intelligence • Information Technology
The Role
As an ML Engineer, you will design sklearn-compatible APIs, manage PyTorch serialization, build preprocessing pipelines, and enhance developer experience in the tabular model ecosystem.
Who we are

Foundation models have transformed text and images, but structured data - the largest and most consequential data modality in the world - has remained untouched. Tables power every clinical trial, every financial model, every scientific experiment, every business decision. No one has built a foundation model that truly understands them.

Until now. What LLMs did for language, we're doing for tables. The next modality shift in AI is happening - and we're hiring the team that makes it.

Momentum: We pioneered tabular foundation models and are now the world-leading organization in structured data ML. Our TabPFN v2 model was published in Nature and set a new state-of-the-art for tabular machine learning. Since its release, we've scaled model capabilities more than 20x, reached 3M+ downloads, 6,000+ GitHub stars, and are seeing accelerating adoption across research and industry - from detecting lung disease with Oxford Cancer Analytics to preventing train failures with Hitachi to improving clinical trial decisions with BostonGene.

The hardest work is in front of us. We're scaling tabular foundation models to handle millions of rows, thousands of features, real-time inference, and entirely new data modalities - while building the infrastructure to deploy them in production across some of the most demanding industries on earth. These are open problems no one else is working on at this level.

Our team: We’re a small, highly selective team of 20+ engineers, researchers, and GTM specialists, selected from over 5,000 applicants, with backgrounds spanning Google, Apple, Amazon, Microsoft, G-Research, Jane Street, Goldman Sachs, and CERN. We are led by Frank Hutter, Noah Hollmann, and Sauraj Gambhir, and advised by world-leading AI researchers such as Bernhard Schölkopf and Turing Award winner Yann LeCun. We ship fast, create top-tier research, and hold each other to an extremely high bar.

What’s Next: In 2025, we raised a €9m pre-seed round led by Balderton Capital and backed by leaders from Hugging Face, DeepMind, and Black Forest Labs. The next phase of growth is here, which makes this an ideal time to join.

About the role

Most companies treat open source as a side job for researchers who'd rather be doing something else. We think that's wrong. Prior Labs is rooted in open source — TabPFN started as a research project the community adopted, and that's how we became a company.

Language models and image models have had years to build out their ecosystem interfaces and integrations. For tabular foundation models, none of that exists yet. You're not plugging into existing patterns — you're creating them. The engineering is genuinely hard: TabPFN does in-context learning, not traditional fit/predict, so wrapping it behind a clean sklearn interface means solving problems no other library has solved. You're designing APIs for a model whose architecture evolves faster than users can upgrade, and making inference robust to the full chaos of real-world tabular data. You understand the model deeply enough to push back when something will break downstream, and you care enough about the details to write great docs and error messages on top of great code.
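To make the fit/predict tension concrete, here is a minimal sketch of an estimator that follows scikit-learn conventions while doing no training in `fit`. It is an illustration only and not TabPFN's actual API: the class name `InContextClassifier`, the `n_context_neighbors` parameter, and the nearest-neighbour vote standing in for a model forward pass are all assumptions made for the example.

```python
from collections import Counter
from math import dist


class InContextClassifier:
    """Hypothetical estimator following scikit-learn conventions (not TabPFN)."""

    def __init__(self, n_context_neighbors=3):
        # scikit-learn convention: __init__ only stores hyperparameters.
        self.n_context_neighbors = n_context_neighbors

    def get_params(self, deep=True):
        return {"n_context_neighbors": self.n_context_neighbors}

    def set_params(self, **params):
        for name, value in params.items():
            if name not in self.get_params():
                raise ValueError(f"Unknown parameter {name!r}")
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        # No weights are trained: fit validates the labeled rows and stores
        # them as the context that every later prediction is conditioned on.
        X = [tuple(float(v) for v in row) for row in X]
        y = list(y)
        if len(X) != len(y):
            raise ValueError(f"X has {len(X)} rows but y has {len(y)} labels")
        if not X:
            raise ValueError("fit needs at least one labeled row")
        self.context_X_ = X
        self.context_y_ = y
        self.classes_ = sorted(set(y))
        return self

    def predict(self, X):
        if not hasattr(self, "context_X_"):
            raise RuntimeError("Call fit before predict")
        return [self._forward(tuple(float(v) for v in row)) for row in X]

    def _forward(self, row):
        # Placeholder for a forward pass over (context, query): a majority
        # vote among the nearest context rows. A real model would go here.
        order = sorted(
            range(len(self.context_X_)),
            key=lambda i: dist(row, self.context_X_[i]),
        )
        votes = Counter(self.context_y_[i] for i in order[: self.n_context_neighbors])
        return votes.most_common(1)[0][0]


clf = InContextClassifier(n_context_neighbors=1)
clf.fit([[0, 0], [1, 1], [5, 5]], ["a", "a", "b"])
print(clf.predict([[0.2, 0.1], [4.8, 5.1]]))  # ['a', 'b']
```

The point of the sketch is that all the real work moves into `predict`, so questions that are trivial for a traditional estimator (what gets serialized, how large the stored context may grow, when input validation runs) become API design decisions.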

What you'll work on:

  • Design sklearn-compatible APIs around a foundation model that doesn't behave like a traditional estimator — solve the hard abstraction problems so the interface feels simple

  • Build and maintain PyTorch serialization, Hugging Face Hub model distribution, and checkpoint management across a multi-model, multi-version ecosystem

  • Build MCP and tool-use wrappers for agentic AI pipelines

  • Model-adjacent ML engineering: preprocessing pipelines, inference wrappers, dtype handling, edge case hardening against real-world data

  • Own releases, CI, testing, and docs across the TabPFN ecosystem — TabPFN (core), tabpfn-client, tabpfn-extensions, tabpfn-time-series

  • General ML engineering: benchmarking, evaluation pipelines, data loading, tooling that makes the team faster

You may be a good fit if you have:

  • 3+ years building and maintaining Python packages or ML libraries used by others (open source track record strongly preferred)

  • Deep fluency in PyTorch, scikit-learn, pandas, NumPy — their internals, extension points, and failure modes, not just their APIs

  • Strong software engineering: testing, CI/CD, packaging (pyproject.toml, uv), semantic versioning, multi-version Python support

  • Comfort reading and working with model code (forward passes, checkpoint loading, inference optimization) and forming opinions about it

  • Solid ML fundamentals: enough to write correct preprocessing, catch data leakage, and push back on design choices that break downstream

  • Genuine care about developer experience: you write great docs and great error messages because you think they're engineering, not chores

Bonus:

  • Maintainer or significant contributor to a popular open source ML/data library

  • Strong AI tooling skills — you use Claude Code, Cursor, or similar fluently to move fast

  • MCP server or tool-use integration experience

  • Hugging Face Hub model distribution experience

  • Background in tabular data, AutoML, or time series

  • Experience debugging cross-platform packaging, or contributing to PyTorch/sklearn core

Location
  • Offices in Freiburg, Berlin, San Francisco and NYC with flexibility to work across our locations

Compensation & Benefits
  • Competitive compensation package with meaningful equity (we compete with the world's biggest AI companies for talent)

  • Work with state-of-the-art ML architecture, substantial compute resources, and a world-class team

  • Annual company-wide offsites to bring the team together (last trip was to the Alps 🏔️)

  • 30 days of paid vacation + public holidays

  • Comprehensive benefits including healthcare, transportation, and fitness

  • Support with relocation where needed

Our Commitments
  • We believe the best products and teams come from a wide range of perspectives, experiences, and backgrounds. That’s why we welcome applications from people of all identities and walks of life, especially anyone who’s ever felt discouraged by "not checking every box."

  • We’re committed to creating a safe, inclusive environment and providing equal opportunities regardless of gender, sexual orientation, origin, disabilities, or any other traits that make you who you are.

Top Skills

Hugging Face Hub
NumPy
pandas
Python
PyTorch
scikit-learn

The Company
HQ: Berlin
11 Employees
Year Founded: 2024

What We Do

Prior Labs is building breakthrough foundation models that understand spreadsheets and databases - the lifeblood of science and business. While foundation models have transformed text and images, tabular data has remained largely untouched. We're tackling this opportunity to revolutionize how we approach scientific discovery, medical research, financial modeling, and business intelligence. We are backed by Balderton Capital, XTX Ventures, SAP founder Hans-Werner Hector's Hector Foundation, Atlantic Labs, Galion.exe, and top AI leaders such as Peter Sarlin, Guy Podjarny, Thomas Wolf, Ed Grefenstette, Robin Rombach, Christopher Lynch, and Ash Kulkarni.
