Senior AI Researcher - Reinforcement Learning (f/m/d)

Heidelberg, Baden-Württemberg, DEU
Hybrid
Senior level
Artificial Intelligence • Information Technology • Internet of Things
The Role
As a Senior AI Researcher, you will enhance reinforcement learning methodologies, conduct large-scale experiments, and collaborate cross-functionally to improve model capabilities.
Our Mission

Aleph Alpha is one of the few companies in Europe with end-to-end in-house model development including pre- and post-training. We’re building models that have general-purpose capabilities, but also specifically excel at addressing the needs of our customers.

We're growing our post-training team in Heidelberg (or hybrid in Germany) and are looking for an AI Researcher who combines a deep theoretical understanding of reinforcement learning methods with a desire to advance the state of the art and improve model capabilities in large-scale training.

Team Culture

At Aleph Alpha, we foster a culture built on ownership, autonomy, and empowerment. Teams and individual contributors are trusted to take responsibility for their work and drive meaningful impact. We maintain a flat organizational structure with efficient, supportive management that enables quick decision‑making, open communication, and a strong sense of shared purpose.

About the role

As a (senior) AI Researcher for reinforcement learning, you will shape and improve the underlying RL methodology, maintain a high-quality training codebase, and conduct large-scale experiments to hill-climb our performance benchmarks. This role is for you if you have both a strong theoretical background in RL and the engineering drive to bring these methods into production and refine them as part of the reinforcement learning team.

In your day-to-day work, you will conduct large-scale reinforcement learning experiments, derive hypotheses from the results, and iterate on both the implementation and the methodology based on your observations. Together with a collaborative team, you will have a direct impact on the models that we ship to our customers.

This role is for Aleph Alpha Research GmbH.

Your Responsibilities
  • Hill-climb in large-scale training: Conduct large-scale LLM training runs, analyze evaluation scores in depth, propose hypotheses for improvement and directly implement them in order to maximize performance on our benchmarks.

  • Theoretical innovation: Stay at the bleeding edge of RL research. You will identify, implement, and iterate on novel approaches to multi-turn reinforcement learning.

  • Scale our training infrastructure: Identify bottlenecks in our training setup and optimize our RL training loops for large-scale training.

  • Cross-functional collaboration: Partner with our other post-training teams to turn raw feedback into actionable training signals, ensuring that our RL iterations lead to measurable improvements in downstream performance.

Your Profile

Basic Qualifications

  • A deep understanding of Reinforcement Learning theory and how it relates to modern RL methods.

  • Experience with multi-node LLM training (ideally using RL). You understand how to scale multi-node RL training and can reason about and implement distributed algorithms.

  • Familiarity with statistical methods for evaluation and experiment design.

  • Ability to reason about what an evaluation/environment measures and whether it matters - not just run benchmarks, but understand them.

  • Strong Python skills and comfort with ML tooling (especially torch.distributed).

  • Willingness to relocate to Heidelberg or travel regularly (potentially weekly).

Preferred Qualifications

  • PhD in reinforcement learning or equivalent research experience.

  • A history of contributions to top-tier venues (NeurIPS, ICML, ICLR, etc.) specifically regarding RL.

  • Experience evaluating LLMs and crafting environments for training.

Compensation and Benefits
  • Become part of an AI revolution!

  • 30 days of paid vacation

  • Access to a variety of fitness & wellness offerings via Wellhub

  • Mental health support through nilo.health

  • Substantially subsidized company pension plan for your future security

  • Subsidized Germany-wide transportation ticket

  • Budget for additional technical equipment

  • Flexible working hours for better work-life balance and hybrid working model

  • Virtual Stock Option Plan

  • JobRad® Bike Lease


The Company
Baden-Württemberg
254 Employees
Year Founded: 2019

What We Do

We are an AI research and application company that researches, develops and operationalises large-scale AI models for language, image data and strategy, thereby contributing to securing Europe's digital sovereignty.
