Applied AI Researcher, Benchmarking

2 Locations
In-Office
Mid level
Artificial Intelligence • Software
The Role
The Applied AI Researcher will design benchmark frameworks, conduct statistical evaluations, and leverage AI to redefine enterprise software usage and measure intelligent system performance.
Summary Generated by Built In

Distyl AI develops AI-native technologies that let humans and AI collaborate to power the operations of the Global Fortune 1000.

In just 24 months, we’ve rapidly grown to partner with some of the world’s largest enterprises—including F100 telecom, healthcare, manufacturing, insurance, and retail companies—delivering multiple AI deployments with $100M+ impact. Our platform, Distillery, along with our team of AI Engineers, Researchers, and Strategists, is pioneering AI-native systems of work, solving the most complex, high-stakes challenges at scale.

Distyl is founded and led by proven leaders from companies like Palantir, Apple, and top national laboratories. We work in deep partnership with OpenAI, jointly going to market at the largest enterprises and collaborating on evaluating and testing the latest models. Backed by Lightspeed, Khosla, Coatue, and industry leaders like Nat Friedman (former GitHub CEO), as well as board members of more than 20 F500s, Distyl is building the future of AI-powered enterprise operations.

What We Are Looking For

At Distyl, we’re pushing the envelope of AI utilization in the enterprise. This requires creative researchers who don’t just want to drive incremental improvements on benchmarks or optimize an existing process, but instead want to redefine how software is used.

Our researchers come from many academic backgrounds but have strong research track records, operate in an AI-native way, and would be bored staying on the rails of a traditional research org.

Key Responsibilities
  • The Benchmarking team defines how progress is measured. Researchers design evaluation frameworks that capture reasoning depth, interaction quality, reliability, and operational impact. They construct benchmarks that reflect real-world complexity. Their systems become the standard by which new architectures, techniques, and releases are judged.

  • Researchers in Benchmarking explore new paradigms for evaluating intelligent systems: adversarial robustness testing, longitudinal performance tracking, and human-in-the-loop assessment. They investigate how metrics shape model behavior and establish rigorous methodologies for quantifying emergent capability. Their insights drive both Distyl’s internal research priorities and industry-wide standards.

What We Require
  • Experience Designing and Running Evaluations: You’ve built or maintained benchmarks, test suites, or experimental frameworks to measure model or system performance.

  • Statistical and Analytical Rigor: You design fair, reproducible experiments and can extract signal from noisy empirical results.

  • Experience Building with Models, Not Just Building Models: We develop intelligent systems using models rather than training or fine-tuning them. Ideal candidates have expertise in compound AI systems, agentic collaboration, and associated techniques (ensembling, ReAct, graph-of-thoughts, etc.).

  • Proven Track Record of Research Results: Whether you’ve published in top journals, posted amazing work on Twitter, or shared it somewhere else, we want to see what you’ve done.

  • Uses AI Every Day: Before you can revolutionize someone else’s workflow, you need to revolutionize yours. You should be using tools like ChatGPT, Cursor, and Perplexity to accelerate your workflow.

  • Strong Programming and Data Analysis Skills: While you might not consider yourself a software engineer, you need to be able to build prototypes of your ideas and then run the experiments that prove their effectiveness to an F500 Head of AI.

  • Bias Toward Showing vs. Telling: Our customers want to see the power of AI today rather than discuss the most elegant idea that will take 5 years to realize.

What We Offer
  • An opportunity to advance the cutting edge of LLM research and directly revolutionize work in the enterprise space.

  • Ownership of high-impact research projects, with the autonomy to explore novel approaches and solutions.

  • Access to state-of-the-art AI models, real business problems, and proprietary data sets across a diverse range of real-world industries.

  • Competitive salary and benefits package, including equity options, medical/dental/vision covered at 100% for you and your dependents, 401K plan, and perks such as commuter benefits and lunch provided in office.

  • A place in a mission-oriented company driving practical AI adoption during the biggest revolution in human productivity.

  • A collaborative and intellectually stimulating environment that encourages innovation and personal growth.

If you are an innovative, ambitious, and driven individual looking to make a difference in the world of AI, we want to hear from you. Apply now to join our team as an Applied AI Researcher and help us shape the future of AI-driven solutions for enterprises across the globe.

Note: Distyl is a hybrid working environment and requires in-office collaboration 3 days a week. We have offices in SF and NYC.

Top Skills

AI
Data Analysis
Programming
Statistical Analysis

The Company
HQ: San Francisco, California
45 Employees

What We Do

Distyl AI is on a mission to create the most customer-centric AI company that revolutionizes how enterprises thrive in the AI-assisted economy. We collaborate with leading institutions worldwide to enhance their AI readiness and build dependable, seamlessly integrated AI-driven solutions tailored to their distinct data, workflows, and employee requirements. Using our proprietary platform of in-house tools and alliances such as the one with OpenAI, our team diligently develops and deploys generative AI products that adhere to the highest standards of integrity and reliability, empowering the institutions that require them the most.
