Machine Learning Engineer

Reposted 7 Days Ago
Hiring Remotely in Oakland, CA, USA
In-Office or Remote
195K-250K Annually
Senior level
Artificial Intelligence • Conversational AI
The Role
As an ML research engineer, you will develop systems using language models, curate datasets, set evaluation metrics, and scale semantic search processes.
Summary Generated by Built In
About Elicit

Elicit is building the reasoning layer for science and decision-making. We use language models to search over 125 million papers, extract data, and surface insights so that researchers, policy-makers, and industry leaders can go from questions to evidence-backed decisions in minutes.

Today, hundreds of thousands of researchers have used Elicit to speed up literature reviews, automate systematic reviews, and explore new domains. As we expand our impact beyond academic research, we are laying the groundwork for ML systems that are systematic, transparent, and unbounded when reasoning at scale.

To do this, Elicit is pioneering supervision of process, not outcomes. Instead of favoring large black-box models, we break complex questions down into human-legible steps and supervise the reasoning process itself. This approach delivers more transparent, defensible answers today and charts a safer path toward advanced AI tomorrow.

Our vision is ambitious: we’re building the default starting point for understanding and reasoning through any hard question. We invite you to help us build that future.

(See how people use Elicit today on Twitter; explore our vision in the roadmap.)

About the role

As a Machine Learning Engineer at Elicit, you’ll build products and workflows that help researchers and scientific teams make higher quality decisions with language models.

This is not a role for someone who only wants to develop models in isolation from user impact. A large part of the work is software engineering: building product experiences, APIs, data integrations, evaluation systems, and harnesses that make language models reliably useful and trustworthy in high-stakes domains.

You’ll work on problems like:

  • Turning messy, ambiguous research tasks into clear product experiences

  • Building interfaces and artifacts that help users understand, trust, and act on model outputs, thinking beyond the chat interface while leveraging full model capabilities

  • Combining language models with external tools, structured and unstructured data, and retrieval systems

  • Improving quality by building careful evaluations, truth-conducive model environments and tools, and targeted ML modeling where the impact is high

What you’ll build
  • Agentic harnesses for target assessment, evidence synthesis, and experiment planning that allow models to provide guarantees about their processes

  • Data integrations across literature, scientific databases, customer data, and internal tools

  • APIs that customers can use in their own systems

  • Evaluation systems that help us understand whether a change actually improves user outcomes

  • Trust and transparency features, like source-quality signals, intermediate reasoning, and better ways to inspect and fix outputs

Example projects

Examples of projects you could work on:

  • Build a target-assessment workflow that combines literature, genetics, chemistry, clinical, regulatory, and company data into a shareable artifact.

  • Build experiment-planning and iteration tools that help researchers decide what to do next and learn from new results.

  • Build evidence-monitoring workflows that keep teams up to date through alerts, briefs, and living reports.

  • Build enterprise APIs and structured-output pipelines that plug Elicit into customers’ internal systems.

  • Build interfaces that make it easier to inspect, trust, and correct model outputs.

  • Build workflow-specific evals and quality systems that tell us whether a product change actually helped users.

  • Improve extraction, reasoning, or search quality with better prompts, better system design, or finetuning when appropriate.

What you bring
  • A strong software engineering background and the ability to build end-to-end systems, not just scripts or notebooks

  • Fluency with language models, enough to reason well about prompting, retrieval, evals, failure modes, and where (and how) finetuning is or isn’t worth it

  • Strong product sense and enjoyment in turning fuzzy user problems into concrete things people can use

  • Excitement about solving difficult, creative problems rather than narrowly optimizing well-defined benchmarks

  • Ability to move across backend, data, and model layers as needed

  • Clear communication with product, design, domain experts, and other engineers

  • Ability to use coding assistants effectively and thoughtfully, with a workflow adapted to get much more out of them

To get a sense for how some of us look at applications, see this thread. (The short version: Wherever we can, we prefer to directly evaluate work.)

You’ll thrive here if you:
  • Like shipping user-facing things quickly

  • Enjoy working on ambiguous problems with a lot of autonomy

  • Care about product quality and user trust, not just technical novelty

  • Want to build new kinds of software made possible by language models

  • Are excited to use AI tools as part of your daily engineering workflow, while still applying strong judgment

What we’re not looking for:

This is probably not the right role if you mainly want to:

  • do low-level model systems work like CUDA optimization or model serving infrastructure as your primary focus

  • work only on research experiments without owning production systems

  • optimize benchmark numbers without much connection to user workflows or product outcomes

We do care about model quality, evals, and sometimes finetuning. But those matter because they help us build products users can rely on, not as ends in themselves.

Am I a good fit?

Consider these questions:

  1. How does a transformer work?

  2. What is a tokenizer?

  3. What is a decorator in Python?

  4. What are generic types?

Strong applicants will find it easy to answer these questions.

Location and travel

We have a lovely office in Oakland, CA, but we also have remote employees across the US. It's important to us to spend time with our teammates, so we ask that all Elicians come together for a quarterly team retreat, normally in or around the SF Bay Area.

Benefits

In addition to working on important problems as part of a happy, productive, and positive team, we also offer great benefits (with some variation based on work location):

  • Flexible work environment - work from our office in Oakland or remotely as long as you can travel to work in-person for retreats and coworking events

  • Fully covered health, dental, vision, and life insurance for you, and generous coverage for the rest of your family

  • Flexible vacation policy, with a minimum recommendation of 20 days/year + company holidays

  • 401K with a 6% employer match

  • A new Mac + $1,000 budget to set up your workstation or home office in your first year, then $500 every year thereafter

  • $1,000 quarterly AI Experimentation & Learning budget, so you can freely experiment with new AI tools to incorporate into your workflow, take courses, purchase educational resources, or attend AI-focused conferences and events

  • A team administrative assistant that you can delegate personal and work tasks to

  • Commuter benefits, a relocation bonus, and more!

  • You can find more reasons to work with us in this thread.

Compensation

For all roles at Elicit, we use a data-backed compensation framework to make sure our salaries are market-competitive, equitable, and simple. For this role, we're targeting starting ranges of:

  • Career (L3): $185-230K + equity

  • Senior (L4): $230-260K + equity

  • Expert/Staff (L5): $255-340K + significant equity

We're optimizing for a hire who can contribute at an L4/senior level or above. We'd love to meet staff/principal level contributors as well.

We also offer above-market equity for all roles at Elicit, as well as employee-friendly equity terms.

Join us!

 

Top Skills

Language Models
Machine Learning
Natural Language Processing
Python
Semantic Search

The Company
Oakland, California
15 Employees
Year Founded: 2023

What We Do

Elicit, the AI research assistant, helps you automate time-consuming research tasks like summarizing papers, extracting data, and synthesizing your findings. We're a public benefit company with a mission to scale up good reasoning. We want machine learning to help as much with thinking and reflection as it does with tasks that have clear short-term outcomes.
