Research Scientist, Reinforcement Learning

Redwood City, CA
In-Office
250K-290K Annually
Senior level
Software • Generative AI
The Role
A Research Scientist in Reinforcement Learning will design algorithms and training workflows for large language models, collaborate with researchers, engineers, and product teams, and improve model performance with RL techniques such as GRPO and DPO.
About Us:

Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked as having the fastest LLM inference, and we have been getting great traction with innovative research projects, like our own function calling and multi-modal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.

The Role:

As a Research Scientist focused on Reinforcement Learning (RL), you’ll apply your deep expertise in the field to push the boundaries of how large language models are trained, aligned, and deployed. We’re looking for someone with a strong foundation in RL - not just familiarity, but hands-on experience designing algorithms, building training pipelines, and running experiments.

You’ll work on everything from scalable RLHF alternatives (e.g., GRPO, DPO) to reward modeling and agent-based training. Your contributions will directly impact Fireworks’ model quality, training workflows, and customer-facing APIs. You’ll also collaborate with researchers, engineers, and product teams to translate state-of-the-art RL into practical systems used by companies deploying LLMs at scale.

Key Responsibilities:
  • Design, implement, and optimize reinforcement learning algorithms to improve the training and alignment of large language models.
  • Develop scalable pipelines for reinforcement learning from human feedback (RLHF) and explore alternatives such as GRPO and DPO.
  • Conduct hands-on experiments across reward modeling, agent-based training, and reinforcement fine-tuning of LLMs.
  • Collaborate with cross-functional teams, including researchers, engineers, and product managers, to integrate cutting-edge RL advancements into production systems.
  • Analyze experimental results and iterate quickly to improve model performance and training workflows.
  • Contribute to the development of Fireworks’ customer-facing APIs by enhancing model alignment and real-world usability.
  • Stay current with the latest research in reinforcement learning, LLM alignment, and AI safety to inform and inspire new initiatives.
Minimum Qualifications:
  • 5+ years of research experience specifically in reinforcement learning
  • Strong understanding of RL fundamentals, including policy gradients, actor-critic methods, offline RL, and preference-based learning
  • Experience with reinforcement fine-tuning of LLMs (e.g., PPO, DPO, GRPO)
  • Experience building and training deep learning models using PyTorch
  • Proficiency in Python and ability to write clean, efficient, research-grade code
  • Demonstrated ability to lead RL experiments from idea to implementation and analysis
  • Excellent communication skills and the ability to collaborate in fast-paced, cross-functional environments
Preferred Qualifications:
  • PhD in Computer Science, Machine Learning, Applied Mathematics, or a related field
  • Publications at top-tier ML conferences (NeurIPS, ICML, ICLR, etc.)
  • Experience building interactive agents that leverage tools, APIs, or search
  • Expertise in reward modeling and LLM evaluation strategies

Total compensation for this role also includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits package. Base salary is determined by a range of factors including individual qualifications, experience, skills, interview performance, market data, and work location. The listed salary range is intended as a guideline and may be adjusted.

Base Pay Range (Plus Equity)
$250,000 to $290,000 USD
Why Fireworks AI?
  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.

Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.

Top Skills

Python
PyTorch
The Company
HQ: Redwood City, CA
63 Employees
Year Founded: 2022

What We Do

Fireworks.ai offers a generative AI platform as a service. We optimize for rapid product iteration when building on top of generative AI, as well as for minimizing cost to serve.

https://fireworks.ai/careers
