Research Engineer - Contextual Bandits & RL

Reposted 19 Days Ago
London, Greater London, England, GBR
In-Office
Mid level
Artificial Intelligence • Machine Learning • Retail • Software
The Role
The Research Engineer will develop offline contextual bandit and reinforcement learning models for hyper-personalisation in retail, focusing on logged interaction data.

About Us

We are a VC-backed startup focused on hyper-personalisation, currently in stealth. Inspired by the latest in recommender systems, we leverage transformers and graph learning alongside decision-making models to build the most engaging customer experiences for in-store retail.

Our mission is to change retail forever through hyper-personalised experiences that are both simple and beautiful.

About the Role - Offline Contextual Bandits and RL for Hyper-personalisation

We are looking for a Research Engineer to build decision-making models for in-store hyper-personalisation, with an initial focus on learning from logged human interaction data in an offline setting. You will work closely with domain experts and engineers to develop contextual bandit and reinforcement learning approaches that can support both single-step decisions and multi-step customer journeys, with the potential to enable online learning over time.

Key Responsibilities

  • Develop and productionise offline contextual bandit and offline RL methods that learn from logged interaction data.

  • Build rigorous off-policy evaluation (OPE) and counterfactual validation to measure candidate policies offline and compare approaches reliably.

  • Formulate and model both single-step decisions (contextual bandits) and multi-step decision processes (sequential, RL-style settings) based on real retail interactions.

  • Advance representation learning for decision-making, including using transformers and GNNs where appropriate for behavioural, relational, and sequential data.

  • Translate research ideas into robust systems: dataset design, modelling, evaluation, deployment, monitoring, and iteration.

  • Collaborate cross-functionally to turn ambiguous product goals into concrete ML objectives, experiments, and deliverables.

Essential Qualifications

  • 3 to 5+ years applying machine learning research in production settings.

  • MSc in Computer Science, Machine Learning, or a closely related field (or equivalent experience).

  • Strong foundations in machine learning and deep learning, including experience with at least one of: contextual bandits, reinforcement learning, counterfactual learning, ranking, or recommender systems.

  • Excellent Python skills and experience developing and debugging production-level code.

  • Ability to reason about evaluation methodology and failure modes when learning from logged interaction data.

Desired Skills (Bonus Points)

  • Demonstrated experience with offline policy learning and evaluation methods (for example, IPS-style and doubly robust estimators, plus uncertainty estimation).

  • Familiarity with bandit algorithms and exploration strategies, with interest in enabling online learning when the product is ready.

  • Experience with recommenders and ranking (candidate generation, reranking, slates).

  • Experience building data pipelines and improving data quality in modern ML environments.

  • PhD in a relevant field.

What We Offer

  • Opportunity to build technology that will transform millions of shopping experiences.

  • Real ownership and impact in shaping product and company direction.

  • A dynamic, collaborative work environment with cutting-edge ML challenges.

  • Competitive compensation and equity in a rapidly growing company.

If you’re excited by the idea of shaping the future of retail and eager to make a real impact from day one, we’d love to hear from you.


The Company

What We Do

We’re a VC-backed stealth-mode company building behavioural AI solutions for the retail industry, focused on hyper-personalisation. Our mission is to bring the intelligence of modern machine learning directly to the in-store shopping experience.
