Research Engineer, Reinforcement Learning

San Francisco, CA, USA
In-Office
Mid level
Artificial Intelligence • Software • Automation
The Role
As a Research Engineer specializing in Reinforcement Learning, you'll develop, refine, and evaluate RL techniques and datasets for optimizing language model behaviors in data environments.
Summary Generated by Built In
Research Engineer – Reinforcement Learning
Location: San Francisco (Hybrid)

About TensorStax
TensorStax is building fully autonomous AI systems to manage and maintain mission-critical data infrastructure and pipelines. We use reinforcement learning to improve language models' ability to reason over large-scale data lakes and warehouses, detect pipeline failures, and construct new pipelines with high precision. The goal is agentic behavior: systems that proactively identify and resolve issues on their own.

As a Research Engineer specializing in Reinforcement Learning, you will:

  • Develop and refine reward functions to optimize agent behavior for complex data engineering tasks.
  • Create RL gym environments for language model agents.
  • Fine-tune language models using reinforcement learning and preference-optimization techniques such as PPO, DPO, and KTO.
  • Stay at the forefront of research on RL for language models, incorporating advancements like GRPO, SWE-Gym, and SWE-RL into practical applications.
  • Curate and build high-quality datasets for supervised fine-tuning (SFT) and RLHF.
  • Design experiments to evaluate and improve the agentic capabilities of language models in data environments.

What We’re Looking For:

  • Deep understanding of reinforcement learning, reward shaping, and optimization strategies.
  • Strong familiarity with LLM fine-tuning techniques (PPO, DPO, KTO) and their applications in reinforcement learning.
  • Knowledge of recent advancements in RL for language models (GRPO, SWE-Gym, SWE-RL).
  • Experience curating and constructing high-quality datasets for fine-tuning.
  • Strong problem-solving skills and a history of working on complex ML projects.
  • High agency: the ability to work independently, experiment proactively, and drive research initiatives forward.

Bonus Points:

  • Experience with distributed training in PyTorch (DDP, FSDP).
  • Hands-on experience designing RL environments for traditional RL problems.
  • Contributions to open-source projects in RL, LLMs, or ML infrastructure.
  • Familiarity with data lakes and warehouses (Snowflake, BigQuery, Redshift).

Benefits:

  • 100% employer-covered health, dental, and vision insurance.
  • 401(k) with company match.
  • Access to Bay Club or Equinox in San Francisco.

The Company
HQ: San Francisco, CA
4 Employees

What We Do

Autonomous AI to help build and maintain data pipelines using your infrastructure.
