Principal Research Engineer, Gemini Evals

Sorry, this job was removed at 08:06 a.m. (CST) on Monday, Mar 30, 2026
Mountain View, CA, USA
In-Office
Artificial Intelligence
The Role

Snapshot 

This role is for a Principal-level Research Engineer to lead the strategic development and execution of robust data pipelines, evaluation frameworks, and metric systems for the Gemini family of models and their associated product applications. As a key technical leader and individual contributor, you will apply deep expertise in large-scale machine learning, statistical rigor, and scalable engineering to ensure the safety, performance, and ethical alignment of our frontier AI systems before and after deployment.

About us

Artificial Intelligence could be one of humanity's most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts, and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

This role is part of the Gemini Evaluation research teams. The Gemini Evals team defines success for Gemini, establishes metrics to track progress, and provides clear, actionable insights to guide development. As a Research Engineer on this team, you will be at the forefront of building the data and evaluation systems that ensure the safety and quality of the Gemini family of models.

The Role

As a Principal Research Engineer, you will operate as a technical expert and leader within the Gemini Data and Evaluation team. Your primary focus will be to architect and execute the rigorous evaluation and data systems that underpin all major model release and product launch decisions for Gemini.

This is a highly cross-functional role requiring a blend of deep ML research, world-class software engineering, and strategic influence. You will define the data strategy for critical evaluation campaigns, design novel metrics to measure safety and performance at scale, and mentor a team of engineers and researchers to build high-quality, reproducible systems. You will be accountable for communicating complex evaluation results directly to leadership stakeholders to guide the responsible deployment of our most advanced AI technology.

Key responsibilities

Technical Leadership & Strategy

  • Work on post-training evaluation and fine-tuning of large-scale models to improve performance and safety.
  • Define and champion the technical roadmap for large-scale data and evaluation supporting the Gemini model family and its real-world applications.
  • Drive research into novel, high-signal evaluation methods (automated, human-in-the-loop, and adversarial) to measure model capabilities, alignment, safety, and trustworthiness.
  • Actively contribute to the broader scientific community by presenting findings on cutting-edge AI evaluation and safety methods.

About You

To set you up for success as a Principal Research Engineer at Google DeepMind, we look for the following skills and experience:

  • 10+ years of experience in research engineering, with at least 5 years in a technical leadership role.
  • Experience with large-scale machine learning systems, data processing pipelines and evaluation methodologies.
  • Experience with large language models (LLMs) and their evaluation.
  • Experience in post-training evaluation research.

The Company
1,218 Employees
Year Founded: 2010

What We Do

We’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority. Our long-term aim is to solve intelligence, developing more general and capable problem-solving systems, known as artificial general intelligence (AGI). Guided by safety and ethics, this invention could help society find answers to some of the world’s most pressing and fundamental scientific challenges. We have a track record of breakthroughs in fundamental AI research, published in journals like Nature, Science, and more. Our programs have learned to diagnose eye diseases as effectively as the world’s top doctors, to save 30% of the energy used to keep data centres cool, and to predict the complex 3D shapes of proteins, which could one day transform how drugs are invented.

