Technology - ML Ops Engineer

Leeds, West Yorkshire, England, GBR
In-Office
Mid level
Pharmaceutical
The Role
The ML Ops Engineer will manage the operation of Machine Learning services on Azure, overseeing the MLOps lifecycle, ensuring reliability, scalability, and performance of models, and collaborating with the Data Science team to optimize production services.
Summary Generated by Built In

Role: ML Ops Engineer

Location: We operate a hybrid schedule, meaning 2-3 days a week in the office based at Thorpe Park, Leeds.

Salary: £ DOE plus extensive benefits

Contract type: Permanent

Employment type: Full time

Working hours: We work on a core hours principle. Our core hours are 09:30 - 16:00; you can work around these to suit you!


Do you want to work for the nation’s largest online pharmacy, ensuring excellence for all our patients? We’re a market leader in the pharmacy world, with 25 years’ experience, helping over 1.8 million patients in England manage their NHS prescriptions from request through to delivery. We are Great Place to Work certified as we consider colleague experience a top priority every day, and as a certified B Corp we also meet high standards of social and environmental responsibility. Our people are fundamental to our success and to ensuring we achieve our vision to be a world-leading, patient-centric digital healthcare provider. We are committed to continuing to develop a positive, open and honest working environment for all.

Our tech teams keep us running 24/7 to make sure all our patients get world-class service. To support that, this role may include participation in an out-of-hours rota as required by the business. We operate a fair scheduling process and provide additional compensation for all on-call periods.

The ML Ops Engineer will drive the operation of production‑grade Machine Learning and LLM services on Azure, ensuring models run as reliable, scalable, and high‑performing systems. Owning the end‑to‑end MLOps/LLMOps lifecycle, the role leads on CI/CD, deployment automation, monitoring, and incident response.

Working closely with Data Science, this role turns models into robust production services, bringing strong governance, observability, and continuous optimisation to ensure fast, safe, and efficient delivery at scale.


Why you’ll love working with us

We believe great people deserve great support. That’s why we offer a benefits package designed to look after your health, finances, career and life outside work.

Financial security & rewards

·        Competitive contributory pension

·        Occupational sick pay

·        Long-service awards and refer-a-friend bonuses

·        Professional registration fees covered (GPhC, NMC, CIPD and more)

·        Cycle to Work and Green Car schemes (subject to eligibility)

Family-friendly

·        Enhanced maternity and paternity pay

·        Flexible hybrid working to help balance work and home life

Health & wellbeing

·        Private healthcare insurance at discounted rates (Aviva)

·        Employee Assistance Programme and in-house mental health support

·        Access to discounted gym memberships via Blue Light Card and benefits schemes

·        Regular health and wellbeing initiatives

Career growth

·        Strong commitment to CPD, training and professional development

Time off & flexibility

·        25 days’ annual leave, increasing with service

·        Buy and sell holiday scheme

Everyday perks & exclusive discounts

·        Blue Light Card and employee discount platform

·        Exclusive discounts at The Springs, Leeds

·        25% off health & beauty purchases

·        25% off Pharmacy2U Private Online Doctor services

Culture & community

·        Regular social events throughout the year


What will you be doing?

Production Deployment & Release Engineering

·        Design and operate CI/CD pipelines for ML models and LLM prompt‑flows, covering build, test, validation, deployment, and rollback

·        Own model registration and promotion across environments, ensuring traceability, governance, and auditability

·        Implement safe deployment strategies (e.g. blue/green, canary, champion/challenger)

·        Package and deploy containerised inference services and batch pipelines, ensuring repeatability and rapid rollback

Reliability Engineering (Day 2 Operations)

·        Run ML and LLM services as production‑grade systems, defining SLOs/SLIs, dashboards, and alerting

·        Lead incident response for runtime issues, including triage, mitigation, recovery, and post‑incident reviews

·        Develop and maintain operational runbooks covering restart, rollback, secret rotation, and safe‑mode scenarios

·        Improve service resilience and reduce MTTR through automation (e.g. self‑healing, retries, fallbacks, circuit breakers)

Observability (Service, Data, Model & Cost)

·        Implement monitoring for availability, latency, errors, resource usage, and job performance

·        Monitor data quality including freshness, volume, completeness, schema drift, and distribution changes

·        Monitor model performance, including drift and prediction distribution shifts, and track accuracy where labels exist

·        Instrument LLM services for token usage, latency, and safety signals, with clear visibility into cost, quotas, and risks

LLMOps: Lifecycle, Quality & Safety

·        Manage prompts and workflows as code, including versioning, code reviews, and automated regression testing

·        Own production configuration for LLM deployments, including model updates, limits, and safeguards

·        Partner with Data Science and Security to ensure robust safety practices, including PII protection and prompt‑injection testing

Security, Privacy & Governance

·        Implement secure access controls, identity management, and secrets handling aligned to best practice

·        Support production readiness through documentation, monitoring plans, cost models, and audit evidence

·        Ensure all changes follow structured governance, with clear traceability and reproducibility


Who are we looking for?

·        Strong Python engineering skills, with experience in ML frameworks such as scikit‑learn, PyTorch, or TensorFlow, and familiarity with experiment tracking

·        Comfortable working in regulated environments, with an understanding of privacy, auditability, change control, and handling sensitive data

·        Strong DevOps/SRE background, including CI/CD, Infrastructure as Code, monitoring and alerting, incident management, and reliability engineering

·        Hands‑on experience with containerisation using tools such as Docker and Kubernetes (e.g. AKS), including debugging, performance tuning, and working with container registries

·        Experience working with Azure, ideally including Azure Machine Learning (pipelines, registries, online and batch endpoints) and Azure Monitor or Log Analytics

·        Experience operationalising ML pipelines, including training, batch scoring, feature engineering workflows, and preventing training‑serving skew

·        Experience implementing safe deployment practices such as blue/green or canary releases, supported by automated validation

·        Understanding of data contracts, schema evolution, and data quality practices, with the ability to troubleshoot data drift and missing features


What happens next?

Please click apply and if we think you are a good match, we will be in touch to arrange an interview.

Applicants must prove they have the right to live in the UK.

All successful applicants will be required to undergo a DBS check.

Unsolicited agency applications will be treated as a gift.




The Company
HQ: Leeds
513 Employees
Year Founded: 1999

What We Do

Pharmacy2U is the UK's first and largest online pharmacy. We're a market leader in the pharmacy world, helping over 750,000 patients in England manage their NHS repeat prescriptions from request through to delivery. For over 20 years, we have used cutting-edge technology alongside our UK-based pharmacists, dispensing specialists, and customer care advisers to help improve the lives of patients. Our Leeds- and Leicester-based dispensing facilities distribute over 1 million medication items to patients across England every month.

KPMG’s Customer Excellence report recently revealed Pharmacy2U is the UK’s number one brand for experience and value, based on a survey of over 10,000 UK consumers. This accolade cements our position as the nation’s go-to online pharmacy and keeps us committed to further broadening our healthcare offering.

Our goal is to make simple and convenient healthcare accessible to all. As well as our repeat prescriptions service, patients can also access:

·        A one-stop online healthcare shop featuring over 4,000 products

·        Confidential online GP consultations for a range of conditions

·        Enhanced pharmacy services such as the NHS New Medicine Service

We're an ambitious company with big plans and we're looking for talented individuals to join our team and help us to continue excelling in our field. Our CEO Kevin Heath says, “come and join a team that truly lives by our company values of putting the patient at the heart of everything we do”. Want to help shape the future of online healthcare in the UK? Discover our latest vacancies today.
