Senior MLOps Engineer - Personalisation

Posted 23 Days Ago
3 Locations
Remote
Senior level
Information Technology • Design
The Role
The Senior MLOps Engineer will automate and operationalize machine learning systems, enhance MLOps frameworks, and optimize ML serving infrastructure for personalized experiences.

Beyond is a technology consultancy helping organizations thrive in a rapidly changing world.

We build, modernize, scale, and operationalize technology, creating Cloud and AI solutions to unlock productivity and drive customer growth.

Role Overview

  • We're looking for a highly experienced Senior MLOps Engineer to own the automation, scaling, and operational excellence of our machine learning systems. This role is the critical bridge between our data science/ML engineering teams and a high-availability production environment.
  • You will evolve our existing pipelines to be best-in-class and operationalise new models (such as next-best-action (NBA), ranking, and LLM-based solutions) with agility and efficiency. Your primary goal is to create a seamless, reliable, and highly observable environment on GCP that empowers our Data Scientists and ML Engineers to iterate and deploy models faster. You will be expected to have created or significantly evolved MLOps frameworks in the past and to be able to quantify the improvements you deliver (e.g., in deployment frequency, model performance monitoring, or system reliability).

What You'll Do

  • Take ownership of and evolve our end-to-end ML lifecycle, from data ingestion and feature engineering pipelines to model training, deployment, and real-time serving.
  • Design, build, and manage robust, automated CI/CD/CT (Continuous Integration / Continuous Delivery / Continuous Training) pipelines specifically for ML models, integrating with existing CI/CD patterns.
  • Leverage the GCP ecosystem, especially Vertex AI Pipelines, Vertex AI Endpoints, and Vertex AI Model Registry, to create a standardised and efficient path to production.
  • Design and own a best-in-class observability framework for ML models in production. This includes implementing granular monitoring for model performance (accuracy, bias), data and concept drift, and operational health (latency, throughput, error rates).
  • Collaborate closely with Data Scientists and ML Engineers to understand their needs, building the tools and abstractions that create a seamless environment and accelerate their workflow.
  • Optimise ML serving infrastructure for low-latency, real-time personalisation requirements.
  • Partner with data engineering to ensure robust integration with feature stores and data sources (like BigQuery and Oracle).
  • Define and track key MLOps metrics to quantify and communicate improvements in system performance, model quality, and team velocity.

What We're Looking For

  • 7+ years of deep, hands-on experience in a dedicated MLOps or DevOps role with a strong focus on machine learning systems.
  • Proven experience building MLOps frameworks from the ground up or significantly evolving existing ones, with clear examples of the improvements you delivered.
  • Expert-level knowledge of the GCP cloud stack, particularly Vertex AI (Pipelines, Endpoints, Training), BigQuery, Pub/Sub, and GKE.
  • Deep expertise in building and managing observability stacks for real-time ML systems (e.g., using tools like Prometheus, Grafana, ELK stack, or specialised platforms).
  • Proven experience operationalising LLM-based systems, including managing embedding generation pipelines, vector databases, and fine-tuning/deployment workflows.
  • Strong practical experience with Infrastructure as Code (IaC) tools (e.g., Terraform, Ansible).
  • Demonstrable expertise in building and managing complex CI/CD pipelines.
  • Proficiency in Python and experience with scripting for automation, infrastructure management, and building tooling for ML teams.
  • Strong understanding of containerisation (Docker, Kubernetes) and microservices architecture as it applies to ML model serving.

Nice to Have

  • Relevant Google Cloud certifications (e.g., Professional Machine Learning Engineer, Professional Cloud DevOps Engineer).
  • A BSc, MSc, or PhD in Computer Science, Engineering, or a related technical field.
  • Hands-on experience with Datadog, especially for monitoring ML systems and cloud infrastructure.
  • Familiarity with the specific deployment challenges of ranking, recommendation, or NBA models.
  • Experience with other ML platforms or tools (e.g., Kubeflow, MLflow).
  • Knowledge of networking and security principles within GCP.

Having been named among the Sunday Times Best 100 Companies, we believe culture plays a large role in what we offer as an organization. We actively promote diversity in all its forms across our Studios, and we proudly, passionately, and proactively strive to create a culture of inclusivity and openness for all our employees.

Beyond is committed to welcoming everyone, regardless of gender identity, orientation, or expression. Our mission is to remove exclusivity and barriers and encourage new thinking and perceptions in a space of belonging. It is not about race, gender, or age; it is about people. And without our people being their most creative and innovative selves, we are nothing.

Top Skills

Ansible
BigQuery
Datadog
Docker
ELK Stack
GCP
GKE
Grafana
Kubernetes
Prometheus
Pub/Sub
Python
Terraform
Vertex AI

The Company
San Francisco, California
453 Employees
Year Founded: 2010

What We Do

Beyond is a design agency dedicated to helping brands make progress by improving their customer experiences. Through a mix of strategy, technology, and design thinking, their teams can create an array of CX solutions—from journey maps to websites to apps. For over a decade, Beyond has served clients as varied as Google, Montblanc, and Brompton.

Beyond is part of the Next Fifteen Group, an AIM-listed company. With £20M in annual revenue, the company employs a global, remote-first workforce centred around hubs in London, Portugal, New York, San Francisco, and Mexico City. It is fostering the next generation of diverse design talent through commitments to Flipside, a paid skills development programme in the UK, and BRIDGEGOOD, a US-based not-for-profit organisation. The company motto is: “Go Further
