AI Infrastructure Engineer / AI Infrastructure Consultant

Posted 7 Days Ago
Mumbai, Maharashtra, IND
Hybrid
Senior level
Information Technology • Consulting
The Role
Design, harden, and operate production-grade AI infrastructure and Kubernetes platforms for training, evaluation, and inference. Build MLOps pipelines (versioning, CI/CD, monitoring, rollback), enforce security-first designs, deploy reliable enterprise systems, and accelerate development using AI coding agents while partnering with ML and application teams.
Summary Generated by Built In
Company Description

Version 1 has celebrated 30 years in business and continues to be trusted by global brands to deliver technology and transformation solutions that drive customer success. Our deep expertise enables our customers to navigate the rapidly evolving technology landscape. We foster strong partnerships with global technology leaders, including Microsoft, AWS, Oracle, Red Hat, OutSystems, and Snowflake, ensuring that our customers receive the highest quality solutions and services.

We’re an award-winning employer, reflecting how our employees are at the very heart of what we do:

  • UK & Ireland's premier AWS, Microsoft & Oracle partner 
  • 3,300+ strong, €350m/£300m revenue business
  • 10+ years as a Great Place to Work in Ireland & UK 
  • Best Workplace for Women in the UK & Ireland by GPTW 
  • Best Workplace for Wellbeing in the UK by GPTW 

We’re a core-values-driven company: we hire people who share our values, and we reward those who display and foster them. It’s deeply embedded within our DNA. Invest in us and we’ll invest in you!

Job Description

About the Role

We are building a production-grade AI platform designed to support advanced machine learning systems at scale. We are seeking a senior infrastructure engineer or AI infrastructure consultant who combines deep platform expertise with hands-on experience using modern AI coding agents to accelerate development.

This role is ideal for someone who has architected and operated enterprise AI systems, understands modern MLOps and cloud-native infrastructure, and actively leverages AI-assisted development tools to move faster without sacrificing reliability, security, or performance.

You will help design, harden, and scale the infrastructure backbone supporting AI workloads across training, inference, and application layers.

What You Will Do

  • Architect and implement scalable infrastructure for AI and ML workloads (training, evaluation, inference).
  • Design and operate Kubernetes-based platforms for multi-tenant, production AI systems.
  • Build and refine MLOps pipelines covering model versioning, experiment tracking, CI/CD, deployment, monitoring, and rollback.
  • Establish DevOps best practices across infrastructure, application, and ML layers.
  • Lead security-first infrastructure design (access control, secrets management, isolation, observability, auditability).
  • Deploy and operate enterprise-grade production systems with strong uptime and reliability standards.
  • Leverage modern AI coding agents and developer copilots to accelerate engineering workflows.
  • Partner with ML engineers and application teams to translate research and product requirements into scalable infrastructure capabilities.

Qualifications

What We Are Looking For

  • 8-12+ years of experience in infrastructure, platform engineering, or distributed systems.
  • Proven experience building and operating enterprise-grade production systems.
  • Deep hands-on expertise with Kubernetes in production (autoscaling, networking, upgrades, reliability patterns).
  • Strong background in MLOps and ML platform lifecycle management.
  • Experience with cloud platforms (AWS, GCP, or Azure) and Infrastructure-as-Code (Terraform, Pulumi, etc.).
  • Practical, hands-on use of AI coding agents / AI-assisted development tools.
  • Strong programming ability in Go, Python, or similar infrastructure-oriented languages.

Nice to Have

  • Experience supporting GPU workloads and large-scale training/inference.
  • Familiarity with enterprise security standards (SOC2, ISO, zero-trust architectures).
  • Experience building internal developer platforms serving multiple teams.
  • Background supporting AI systems in regulated or high-reliability environments.

Additional Information

Why Version 1? 

 At Version 1, we believe in providing our employees with a comprehensive benefits package that prioritises their wellbeing, professional growth, and financial stability. 

  • Share in our success with our Quarterly Performance-Related Profit Share Scheme, where employees collectively benefit from a share of our company's profits 
  • Strong Career Progression & mentorship coaching through our Strength in Balance & Leadership schemes with a dedicated quarterly Pathways Career Development programme 
  • Flexible/remote working: Version 1 is tremendously understanding of life events and people’s individual circumstances and offers flexibility to help achieve a healthy work-life balance
  • Financial Wellbeing initiatives including Pension, Private Healthcare Cover, Life Assurance, Financial advice and an Employee Discount scheme
  • Employee Wellbeing schemes including Gym Discounts, Bike to Work, Fitness classes, Mindfulness Workshops, Employee Assistance Programme and much more. Generous holiday allowance, enhanced maternity/paternity leave, marriage/civil partnership leave and special leave policies 
  • Educational assistance, incentivised certifications, and accreditations, including AWS, Microsoft, Oracle, and Red Hat 
  • Reward schemes including Version 1’s Annual Excellence Awards & ‘Call-Out’ platform. 
  • Environment, Social and Community First initiatives allow you to get involved in local fundraising and development opportunities as part of fostering our diversity, inclusion and belonging schemes. 

And many more exciting benefits… drop us a note to find out more.    

Version 1 is an equal opportunities employer. 

We are committed to building a diverse, inclusive and respectful workplace where everyone feels valued and able to thrive. We welcome applications from people of all backgrounds, identities and lived experiences, and we value the different perspectives people bring, including those shaped by disability and neurodiversity.

We want every candidate to have a positive and accessible recruitment experience. If you need reasonable adjustments at any stage of the process, please contact your recruiter at Version 1. We will consider all requests carefully, respectfully and confidentially. 

Video links: https://www.youtube.com/watch?v=F_d3ELTH5zo


The Company
HQ: Dublin
3,000 Employees
Year Founded: 1996

What We Do

Version 1 proves that IT can make a real difference to our customers' businesses. We are trusted by global brands to deliver IT services and solutions which drive customer success. Our 3000+ strong team works closely with our technology partners to provide independent advice that helps our customers navigate the rapidly changing world of IT. Our greatest strength is balance in our efforts to achieve Customer Success, Empowered People and a Strong Organisation, underpinned by the commitment to our values. We believe this is what makes Version 1 different and more importantly, our customers agree.
