Research Scientist - Frontier Data

Posted 3 Days Ago
San Francisco, CA, USA
In-Office
250K-450K Annually
Junior
Artificial Intelligence • Big Data
The Role
Design datasets and evaluation frameworks for foundation models, run experiments exposing failure modes across domains, improve annotator behavior, and develop quantitative metrics for dataset quality and downstream model impact. Partner with lab research teams to translate objectives into data and evaluation specs for RLHF/RLVR and LLM training pipelines.
Summary Generated by Built In
About AfterQuery

AfterQuery is an applied research lab curating data solutions for foundation model development.

We serve every frontier AI lab with the mission of delivering the best data to power the best models. In doing so, we can make expertise that once took a lifetime to build available to anyone who needs it. Our customers are the ones building the foundation models themselves, and our work sits directly in the loop of how those systems improve.

This is a rare opportunity to join a company at a defining moment in AI. Since raising our $30M Series A at a $300M valuation, AfterQuery has grown to well over a $100M revenue run rate.

We're based in San Francisco and backed by leading investors including Altos Ventures, BoxGroup, and Y Combinator, as well as angels from Google DeepMind, OpenAI, Anthropic, Meta Superintelligence Labs, and Microsoft AI.

AfterQuery builds the training data and evaluation infrastructure that frontier AI labs use to make their models better. We work with the world's leading labs to design high-signal datasets and run rigorous evaluations that go beyond static benchmarks. We are a small, early team (post-Series A) where individual contributors have a direct impact on how the next generation of models learns and improves.

The Role

You'll design the datasets and evaluation frameworks that shape how frontier models are trained and measured. Working directly with research teams at top AI labs, you'll experiment with data collection strategies, diagnose model failure modes, and develop the metrics that determine whether a model is actually getting better. This is hands-on, high-leverage work: you'll go from hypothesis to live experiment quickly, and your output will directly influence model training runs at scale.

What You'll Do

  • Design data slices and explore data shapes that expose meaningful model failure modes across domains like finance, code, and enterprise workflows

  • Build and refine evaluation rubrics and reward signals for RLHF and RLVR training pipelines

  • Model annotator behavior and run experiments to improve different model capabilities

  • Develop quantitative frameworks for measuring dataset quality, diversity, and downstream impact on model alignment and capability

  • Partner with lab research teams to translate their training objectives into concrete data and evaluation specifications

What We're Looking For

  • Great candidates have done undergraduate or master's research (but have not done a PhD)

  • Major plus if you've worked or interned at an RL environment company, or at an AI safety or benchmarking org such as METR or Artificial Analysis

  • Genuine obsession with how data structure, selection, and quality drive model behavior

  • Ability to design lightweight experiments, move fast, and extract actionable insights from messy results

  • Comfort working across domains (you'll touch finance, software engineering, policy, and more)

  • Strong quantitative instincts and familiarity with LLM training pipelines, RLHF/RLVR, or evaluation methodology

  • A bias toward building over theorizing

Compensation Structure

  • Annual target cash compensation of $250K-$450K, plus meaningful equity

  • Comprehensive benefits (Uber Eats and rideshare stipend, complimentary Equinox membership, 401(k) with match, and health, dental, and vision insurance)

Skills Required

  • Undergraduate or Master's research experience (PhD not required or expected)
  • Experience designing datasets and evaluation frameworks for model training
  • Familiarity with LLM training pipelines, RLHF or RLVR, or evaluation methodology
  • Strong quantitative instincts and ability to analyze messy experiment results
  • Ability to design lightweight experiments quickly and extract actionable insights
  • Comfort working across domains (finance, software engineering, policy, etc.)
  • Obsessive focus on data structure, selection, and quality driving model behavior
  • Bias toward building and practical implementation over theorizing
  • Experience at RL environment companies or AI safety/benchmarking organizations (e.g., METR, Artificial Analysis)
The Company
200 Employees
