Patronus AI is building hyperrealistic, diverse reinforcement learning (RL) environments and benchmarks to steer frontier AI models. Our mission is to enable next-generation AGI capabilities through scalable oversight.
We are a team of AI researchers and engineers formerly from companies like Meta AI, Amazon AGI, and Google. As a team, we have published research papers at top ML conferences (NeurIPS, EMNLP, ACL), and we are the creators of popular AI products and benchmarks like FinanceBench, SimpleSafetyTests, CopyrightCatcher, Humanity’s Last Exam, and more. Our customers include foundation model labs and Fortune 500 enterprises like The Volkswagen Group. We are backed by top-tier investors like Notable Capital, Lightspeed Venture Partners, Stanford University, and leading researchers at OpenAI, Databricks, and more.
As a Research Scientist at Patronus AI, you will be pivotal to solving the most important and challenging open research problems facing society’s adoption of AI today, including AI evaluation, language model understanding, and robustness.
In this role, you will:
- Develop state-of-the-art systems for AI evaluation. Implement algorithms and models based on state-of-the-art NLP advancements, especially in the areas of evaluation and LLM alignment.
- Conduct novel research on red-teaming language models, automated evaluation, and alignment.
- Scope out and lead research projects, including designing experiments, setting timelines for research deliverables, and interpreting results.
- Develop processes for high-quality research, including dataset collection, model training, benchmarking, and inference.
- Experiment with the latest technologies and proactively suggest experiments and improvements to research and ML systems. Adapt to changes in the generative AI landscape, and incorporate new models into the platform when applicable.
- Assist in the construction of high-quality, novel datasets for classification and generative tasks, using synthetic data augmentation techniques and publicly available datasets.
- Contribute to research-to-production efforts that advance product offerings.
- Collaborate closely with product and engineering across our globally distributed team.
“The number one qualification to succeed in this machine learning course is gumption” - John Lafferty, CS Professor at Yale
Above all, we look for a proactive mindset, willingness to learn, relentless drive, and passion for working hands-on with customers. You are a great fit if you have a background in the following:
- Publications at leading AI conferences, journals or workshops, such as NeurIPS, ICML, EMNLP, ACL, AAAI.
- Experience conducting empirical NLP research in an academic or industry research lab.
- Knowledge and understanding of state-of-the-art machine learning concepts, with a focus on NLP. Familiarity with transformer-based architectures, attention mechanisms, evaluation metrics and benchmarks.
- Experience training language models in applied or research settings.
- Experience working and communicating cross-functionally in a team environment.
- Creativity in problem solving and strong communication skills.
- Good character, integrity, and respect for others.
- Competitive salary and equity packages
- Health, dental, and vision insurance plans
- 401(k) plan
- Monthly meal stipend
- Monthly health and wellness stipend
- Fun global offsites!
Patronus AI is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.
What We Do
Patronus AI is the leading AI evaluation and optimization company. Our research-backed product enables AI engineers to optimize their agents, access powerful evaluation models, and automatically detect LLM system performance issues across 50+ modes. Leading technology companies and enterprises like AngelList, Etsy, and Pearson use Patronus AI to ship top-tier AI products.
Founded by machine learning experts from Meta, Patronus AI is on a mission to accelerate the world's adoption of generative AI. We are backed by Notable Capital, Lightspeed Venture Partners, Stanford University, Datadog, Gokul Rajaram, and leading software and AI executives.