TensorStax

Research Engineer, Reinforcement Learning

San Francisco, CA, USA

In-Office

Mid level

Artificial Intelligence • Software • Automation

The Role

As a Research Engineer specializing in Reinforcement Learning, you'll develop, refine, and evaluate RL techniques and datasets for optimizing language model behaviors in data environments.

Research Engineer – Reinforcement Learning

Location: San Francisco (Hybrid)

About TensorStax

TensorStax is building fully autonomous AI systems to manage and maintain mission-critical data infrastructure and pipelines. We use reinforcement learning to improve language models' ability to reason over large-scale data lakes and warehouses, detect pipeline failures, and construct new pipelines with high precision. The goal is agentic behavior: systems that identify and resolve issues on their own.

As a Research Engineer specializing in Reinforcement Learning, you will:

Develop and refine reward functions to optimize agent behavior for complex data engineering tasks.
Create RL gym environments for language model agents.
Fine-tune language models using reinforcement learning and preference-optimization techniques such as PPO, DPO, and KTO.
Stay at the forefront of research on RL for language models, incorporating advancements like GRPO, SWE-Gym, and SWE-RL into practical applications.
Curate and build high-quality datasets for supervised fine-tuning (SFT) and RLHF.
Design experiments to evaluate and improve the agentic capabilities of language models in data environments.

What We’re Looking For:

Deep understanding of reinforcement learning, reward shaping, and optimization strategies.
Strong familiarity with LLM fine-tuning techniques (PPO, DPO, KTO) and how they are applied to optimize language model behavior.
Knowledge of recent advancements in RL for language models (GRPO, SWE-Gym, SWE-RL).
Experience curating and constructing high-quality datasets for fine-tuning.
Strong problem-solving skills and a history of working on complex ML projects.
High agency—ability to work independently, experiment proactively, and drive research initiatives forward.

Bonus Points: