Research Engineer - Scalable Interpretability

Posted 5 Days Ago
San Francisco, CA, USA
In-Office
250K-500K Annually
Senior level
Artificial Intelligence • Software
The Role
Develop and train scalable interpretability assistants to predict and detect subtle model behaviors. Create diverse evaluations, design novel architectures and objectives, and scale training/inference pipelines to support up to 1T‑scale models. Collaborate closely with a research team and contribute to high-impact evaluations used for industry standards and regulation.
Summary Generated by Built In
Salary range: $250,000 - $500,000/year + benefits

Description: Transluce is a non-profit research lab building tools for scalable, end-to-end oversight of AI systems. We build world-class, AI-backed analysis tools and use these to set industry standards for evaluation. Our tools are integrated with core agent benchmarks like SWE-bench, and our evaluations directly underpin regulation, including through our role as the EU AI Office’s main evaluation developer for harmful manipulation risks.

About the role: We are looking for strong scientists and engineers to help advance our vision of scalable end-to-end oversight assistants, building on our recent advances such as predictive concept decoders and user model extractors. As part of our highly collaborative team, you will learn and grow quickly, creating technology at the frontier of AI research and with high direct impact.

Core responsibility: Help us develop and train scalable interpretability assistants that can predict and detect unexpected and subtle behaviors from models’ activations. This includes:
  • Creating diverse evaluations that range in difficulty. This involves finding naturally occurring interesting and undesirable behaviors exhibited by open-source models.
  • Developing novel architectures and objectives for training interpretability assistants.
  • Scaling up the training and inference pipelines to support up to 1T-scale models.

Qualities of a strong candidate:
  • Experience with fine-tuning language models, designing new architectures, and creating evaluations
  • Reliable results: good experimental design, epistemic self-awareness and transparency
  • Generativeness: coming up with original, productive ideas for unblocking progress
  • Curiosity: a desire to understand ML systems and how they work
  • Strong programming ability, including navigating trade-offs between prototyping speed and maintainability
  • Strong communication skills, low ego, openness to giving and receiving feedback

We are located in San Francisco and are enthusiastic about working together in person. We are open to sponsoring international visas.

Skills Required

  • Experience fine-tuning language models and designing new model architectures
  • Experience creating diverse evaluations to find undesirable behaviors in open-source models
  • Experience scaling training and inference pipelines to support very large (up to 1T-scale) models
  • Reliable experimental design, epistemic self-awareness, and transparency in results
  • Strong programming ability and pragmatic trade-offs between prototyping speed and maintainability
  • Generative problem-solving and original idea generation to unblock progress
  • Curiosity about ML systems, strong communication skills, and openness to feedback
  • Willingness to work in person in San Francisco (the organization is enthusiastic about in-person collaboration)

The Company
HQ: San Francisco, California
20 Employees
Year Founded: 2024

What We Do

Transluce is an independent research lab that builds open, scalable technology for understanding AI systems and steering them in the public interest. Transluce means to shine light through something to reveal its structure. Today’s complex AI systems are difficult to understand: not even experts can reliably predict their behavior once deployed. Given AI's extraordinary consequences for society, we need scalable and open analyses of the capabilities and risks of AI systems. We are building open-source, AI-driven tools to understand and analyze AI systems. We will apply these tools to open-weight models, so the world can vet our analyses and improve their reliability. Once our technology has been vetted, we will work with frontier AI labs and governments to ensure that internal assessments reach the same standards as our publicly vetted procedures. Email: [email protected]

