Mercor

Research Engineer - Environments, Data and Post-Training

San Francisco, CA, USA

In-Office

Junior

Artificial Intelligence • Software

We use AI to understand human ability and match talent with the opportunities they're best suited for.

The Role

As a Research Engineer, you'll enhance AI models through post-training and evaluation pipelines, focusing on data quality and performance metrics.

Summary Generated by Built In

About Mercor

Mercor's mission is to organize human intelligence to power the AI economy. We're a leading AI data company, building the layer between human expertise and frontier models. Millions of domain experts on the platform are paid over $4 million per day to train frontier AI models. Mercor's APEX benchmark family measures AI's real-world impact on professional work. Mercor Enterprise brings this same infrastructure to Fortune 500 companies, helping them capture how their best people actually work and translating that expertise directly into agents.

Mercor is creating a new category of work where expertise powers AI advancement. Achieving this requires an ambitious, fast-paced, and deeply committed team. You’ll work alongside researchers, operators, and AI companies at the forefront of shaping the systems that are redefining society. Mercor is a profitable Series C company valued at $10 billion. We work in person five days a week in our San Francisco, NYC, or London offices.

About the Role

As a Research Engineer at Mercor, you’ll work at the intersection of engineering and applied AI research. You’ll contribute directly to post-training and RLVR, synthetic data generation, and large-scale evaluation workflows that meaningfully impact frontier language models.

Your work will be used to train large language models to master tool use, agentic behavior, and real-world reasoning in production environments. You’ll shape rewards, run post-training experiments, and build scalable systems that improve model performance. You’ll help design and evaluate datasets, create scalable data augmentation pipelines, and build rubrics and evaluators that push the boundaries of what LLMs can learn.

What You’ll Do

Work on post-training and RLVR pipelines to understand how datasets, rewards, and training strategies impact model performance.
Design and run reward-shaping experiments and algorithmic improvements (e.g., GRPO, DAPO) to improve LLM tool use, agentic behavior, and real-world reasoning.
Quantify data usability, quality, and performance uplift on key benchmarks.
Build and maintain data generation and augmentation pipelines that scale with training needs.
Create and refine rubrics, evaluators, and scoring frameworks that guide training and evaluation decisions.
Build and operate LLM evaluation systems, benchmarks, and metrics at scale.
Collaborate closely with AI researchers, applied AI teams, and experts producing training data.
Operate in a fast-paced, experimental research environment with rapid iteration cycles and high ownership.

What We’re Looking For

Strong applied research background, with a focus on post-training and/or model evaluation.
Strong coding proficiency and hands-on experience working with machine learning models.
Strong understanding of data structures, algorithms, backend systems, and core engineering fundamentals.
Familiarity with APIs, SQL/NoSQL databases, and cloud platforms.
Ability to reason deeply about model behavior, experimental results, and data quality.
Excitement to work in person in San Francisco five days a week (with optional remote Saturdays) and to thrive in a high-intensity, high-ownership environment.

Nice To Have

Real-world post-training team experience in industry (highest priority).
Publications at top-tier conferences (NeurIPS, ICML, ACL).
Experience training models or evaluating model performance.
Experience in synthetic data generation, LLM evaluations, or RL-style workflows.
Work samples, artifacts, or code repositories demonstrating relevant skills.

Benefits

Bi-annual performance bonus structure
Generous equity grant vesting over 4 years
Up to $15K relocation bonus
$10K housing bonus (if you live within 0.5 miles of our office)
$1.5K monthly stipend for meals
Free Equinox membership
$200 monthly laundry reimbursement
$200 monthly personal wellness reimbursement
Health, dental, and vision insurance

Mercor Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes drawn from responses that popular LLMs generated to common candidate questions about Mercor. It has not been reviewed or approved by Mercor.

Fair & Transparent Compensation — Pay is considered competitive across many roles, with clear hourly ranges and an hourly/pay‑per‑task mix designed to align rates with expertise. The structure emphasizes transparent, appropriate pay levels and guarantees payment for legitimate logged time.
Strong & Reliable Incentives — Payments are processed on a predictable weekly cadence via Stripe/Wise, and some tracks offer additional weekly bonus incentives for top performers. This combination of regular payouts and performance bonuses supports dependable earnings when projects are active.
Equity Value & Accessibility — Select full‑time roles include generous equity grants alongside cash perks such as relocation and housing bonuses. These elements increase total compensation for those positions.