Forward-Deployed RL Engineer | Ingénieur·e Forward Deployed en apprentissage par renforcement
NBCUniversal is one of the world's leading media and entertainment companies. We create world-class content, which we distribute across our portfolio of film, television, and streaming, and bring to life through our global theme park destinations, consumer products, and experiences. We own and operate leading entertainment and news brands, including NBC, NBC News, NBC Sports, Telemundo, NBC Local Stations, Bravo, and Peacock, our premium ad-supported streaming service. We produce and distribute premier filmed entertainment and programming through our powerhouse film and television studios, including Universal Pictures, DreamWorks Animation, and Focus Features, and the four global television studios under the Universal Studio Group banner, and operate industry-leading theme parks and experiences around the world through Universal Destinations & Experiences, including Universal Orlando Resort, home to Universal Epic Universe, and Universal Studios Hollywood. NBCUniversal is a subsidiary of Comcast Corporation. Visit www.nbcuniversal.com for more information.
Our impact is rooted in improving the communities where our employees, customers, and audiences live and work. We have a rich tradition of giving back and ensuring our employees have the opportunity to serve their communities. We champion an inclusive culture and strive to attract and develop a talented workforce to create and deliver a wide range of content reflecting our world.
NBCUniversal est l'une des principales entreprises mondiales de médias et de divertissement. Nous créons du contenu de calibre mondial, que nous distribuons à travers notre portefeuille de cinéma, de télévision et de diffusion en continu, et que nous faisons vivre par le biais de nos destinations de parcs thématiques mondiaux, de nos produits de consommation et de nos expériences. Nous détenons et exploitons des marques de divertissement et d'information de premier plan, notamment NBC, NBC News, NBC Sports, Telemundo, NBC Local Stations, Bravo et Peacock, notre service de diffusion en continu haut de gamme financé par la publicité.
Nous produisons et distribuons des œuvres cinématographiques et des contenus télévisuels de premier plan grâce à nos puissants studios de cinéma et de télévision, notamment Universal Pictures, DreamWorks Animation et Focus Features, ainsi qu'aux quatre studios de télévision mondiaux regroupés sous la bannière Universal Studio Group. Nous exploitons également des parcs thématiques et des expériences de calibre industriel à travers le monde par l'entremise de Universal Destinations & Experiences, notamment Universal Orlando Resort, domicile de Universal Epic Universe, et Universal Studios Hollywood. NBCUniversal est une filiale de Comcast Corporation. Visitez https://www.nbcuniversal.com pour plus d'information.
Notre impact repose sur l'amélioration des communautés dans lesquelles nos employé·e·s, nos client·e·s et nos publics vivent et travaillent. Nous avons une riche tradition d'engagement communautaire et veillons à ce que nos employé·e·s aient l'occasion de servir leurs communautés. Nous défendons une culture inclusive et nous efforçons d'attirer et de développer une main-d'œuvre talentueuse afin de créer et de diffuser une vaste gamme de contenus reflétant la diversité de notre monde.
Job Description
Role Summary
We are seeking a Reinforcement Learning Engineer with experience manipulating virtual environments to train autonomous agents. This role focuses on the design of robust simulation environments, reward structures, and policy architectures that can navigate complex, multi-sensor landscapes.
Responsibilities
- Cross-Functional Coordination: Work with partner ML and Annotation engineers and TPMs to spec out data, simulation, and training requirements.
- Environment Design: Build and maintain high-fidelity 2D/3D simulation environments (using tools like Unity, Unreal, or Isaac Sim) that serve as the training ground for RL agents.
- Reward Engineering: Design and tune complex reward functions that align agent behavior with product goals and safety constraints.
- Algorithm Implementation: Develop and optimize RL algorithms (e.g., PPO, SAC, or Offline RL) capable of handling high-dimensional 3D observation spaces.
- Sim-to-Real Strategy: Analyze the "reality gap" and implement domain randomization or adaptation techniques to ensure models perform reliably in real-world scenarios.
Résumé du poste
Nous recherchons une personne Ingénieur·e Forward Deployed en apprentissage par renforcement ayant de l'expérience dans la manipulation d'environnements virtuels pour entraîner des agents autonomes. Ce poste est axé sur la conception d'environnements de simulation robustes, de structures de récompense et d'architectures de politiques capables d'évoluer dans des environnements complexes et multi-capteurs.
Responsabilités
- Coordination interfonctionnelle : Collaborer avec des personnes ingénieur·e·s en ML et en annotation ainsi qu'avec des TPM afin de définir les exigences en matière de données, de simulation et d'entraînement.
- Conception d'environnements : Concevoir et maintenir des environnements de simulation 2D/3D haute fidélité (à l'aide d'outils tels que Unity, Unreal ou Isaac Sim) servant de terrain d'entraînement pour les agents d'apprentissage par renforcement.
- Ingénierie des récompenses : Concevoir et ajuster des fonctions de récompense complexes afin d'aligner le comportement des agents avec les objectifs du produit et les contraintes de sécurité.
- Implémentation d'algorithmes : Développer et optimiser des algorithmes d'apprentissage par renforcement (p. ex., PPO, SAC ou RL hors ligne) capables de gérer des espaces d'observation 3D de grande dimension.
- Stratégie simulation-vers-réel : Analyser l'« écart de réalité » et mettre en œuvre des techniques de randomisation de domaine ou d'adaptation afin d'assurer des performances fiables des modèles en conditions réelles.
Qualifications
Basic Qualifications
- Education: Graduate degree (Master's or PhD) in Robotics, Computer Science, AI, or a related field with a focus on Reinforcement Learning, Imitation Learning, or other Online Machine Learning fields.
- Professional Experience: Proven experience as an RL Engineer or Research Engineer in a fast-paced environment.
- Industry Context: Prior experience in industries with complex multi-disciplinary teams such as robotics, smart grids, precision agriculture, game development, or aerospace.
- Technical Proficiency:
  - Core Tools: Fluency with Python, Git, and the Unix shell.
  - RL Frameworks: Deep familiarity with frameworks like Ray RLlib, Stable Baselines3, or CleanRL.
  - Physics & 3D Engines: Experience with physics engines (MuJoCo, Bullet) or 3D game engines.
  - Ecosystem: Familiarity with collaborative tools such as Jira/Confluence, Slack, a Git server, and an experiment tracking framework.
Desired Characteristics
- Strong Mathematical Background: Essential for understanding Markov Decision Processes (MDPs) and gradient-based optimization.
- High Attention to Detail: Critical for debugging non-deterministic agent behaviors and ensuring environment parity.
Eligibility Requirements
- Interested candidates must apply to be considered.
- Must be willing to work in our Montreal office a minimum of 4 days a week.
- Must be legally authorized to work in Canada.
- Must be willing to travel for work-related business, if necessary.
Qualifications de base
- Formation : Diplôme de deuxième ou troisième cycle (maîtrise ou doctorat) en robotique, informatique, IA ou dans un domaine connexe, avec une spécialisation en apprentissage par renforcement, apprentissage par imitation ou autres domaines d'apprentissage automatique en ligne.
- Expérience professionnelle : Expérience démontrée en tant que personne Ingénieur·e en apprentissage par renforcement ou Ingénieur·e de recherche dans un environnement dynamique.
- Contexte industriel : Expérience préalable dans des secteurs regroupant des équipes multidisciplinaires complexes, tels que la robotique, les réseaux intelligents, l'agriculture de précision, le développement de jeux ou l'aérospatiale.
- Compétences techniques :
  - Outils de base : Maîtrise de Python, Git et de l'environnement shell Unix.
  - Cadres RL : Solide expertise avec des frameworks tels que Ray RLlib, Stable Baselines3 ou CleanRL.
  - Moteurs physiques et 3D : Expérience avec des moteurs physiques (MuJoCo, Bullet) ou des moteurs de jeux 3D.
  - Écosystème : Familiarité avec des outils collaboratifs tels que Jira/Confluence, Slack, un serveur Git et un cadre de suivi des expériences.
Atouts souhaités
- Solides bases en mathématiques : Essentielles pour comprendre les processus de décision markoviens (MDP) et l'optimisation par gradient.
- Grand souci du détail : Indispensable pour déboguer les comportements non déterministes des agents et assurer la parité des environnements.
Exigences d'admissibilité
- Les personnes intéressées doivent soumettre leur candidature afin d'être considérées.
- Doit être disposé·e à travailler dans nos bureaux de Montréal au minimum quatre (4) jours par semaine.
- Doit être légalement autorisé·e à travailler au Canada.
- Doit être disposé·e à se déplacer pour des raisons professionnelles, au besoin.
Additional Information
As part of our selection process, external candidates may be required to attend an in-person interview with an NBCUniversal employee at one of our locations prior to a hiring decision. NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law.
If you are a qualified individual with a disability or a disabled veteran and require support throughout the application and/or recruitment process as a result of your disability, you have the right to request a reasonable accommodation. You can submit your request to [email protected].
Skills Required
- Graduate degree in Robotics, Computer Science, AI, or related field
- Proven experience as an RL Engineer or Research Engineer in a fast-paced environment
- Experience with 2D/3D simulation environments using Unity, Unreal, or Isaac Sim
- Familiarity with RL algorithms like PPO, SAC, or Offline RL
- Fluency with Python, Git, and the Unix shell
- Experience with physics engines (MuJoCo, Bullet) or 3D game engines
- Knowledge of collaborative tools like Jira/Confluence, Slack
- Strong mathematical background for Markov Decision Processes (MDPs)
NBCUniversal Compensation & Benefits Highlights
- Parental & Family Support: Offerings include fertility and adoption assistance, caregiving resources, and paid parental leave for both primary and non-primary caregivers, indicating a strong family focus. Company materials highlight family-building and caregiver programs as core parts of the package for eligible roles.
- Leave & Time Off Breadth: The U.S. package outlines vacation, company holidays, personal “myDays,” caregiving days, sick time, and bereavement leave, signaling breadth beyond standard PTO structures. This variety is emphasized across employer materials as a notable part of the offering.
- Retirement Support: Automatic 401(k) enrollment with dollar-for-dollar matching on a defined portion of pay and access to financial planning resources underscore support for long-term savings. Materials also note that plan specifics depend on eligibility, program, and employee group.
NBCUniversal Insights
What We Do
Across film, television, news, theme parks, interactive media, and streaming, our people are at the center of it all. Here, we solve complex and business-critical problems. That’s why we’re looking for people to help us continue our evolution, imagining and delivering the most innovative and disruptive products and services through the latest tech advancements in the industry.
Here you can develop solutions. You’ll develop solutions that allow engineers to broadcast live TV from the comfort of their homes. These solutions will enable the use of our collection of hundreds of thousands of distinct intellectual properties across our film, television, and streaming brands.
Here you can transform. You’ll make decisions and solve complex problems by leveraging insights that come from data, building AI to help optimize every aspect of our content ecosystem.
Here you can build. You’ll build emerging immersive technologies that power the broadcasts and streaming of global events like the Super Bowl and the Olympics. You can create secure, elastic cloud-based services connecting parts of our global platform ecosystem that affect tens of millions of viewers, consumers, and businesses that consume and love NBCUniversal’s content.
And while you design, build, and architect your career, we have the culture to make sure you’re supported. Here you can work and still live your best life! We’re leaders in our fields. We hire smart people and trust them to get the job done. We are never too busy to develop a fellow colleague. We understand our goals, or we ask. When we see something that needs doing, we do it. We make data-driven decisions. We fiercely believe in our talent and their growth. If you're ready to make an impact, here you can.
Why Work With Us
For us, it's more than just a work life. It's a daily passion. We take great pride in our legacy. We find fun in the challenge. We collaborate and inspire others. We're always creating, always solving, and always ahead of the competition.
NBCUniversal Offices
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
