L2 Datacenter Support Engineer

Posted 5 Days Ago
Poznań, Województwo wielkopolskie, POL
Hybrid
Mid level
Software
The Role
As an L2 Datacenter Support Engineer, you will support high-performance AI infrastructure, manage InfiniBand fabrics, and troubleshoot various platform issues while collaborating with engineering teams.
Summary Generated by Built In
Company Description

Mirantis helps organizations ship code faster on public and private clouds. The company provides a public cloud experience on any infrastructure, from the data center to the edge. With Lens and the Mirantis Cloud Native Platform, Mirantis empowers a new breed of Kubernetes developers by removing infrastructure and operations complexity. It provides one cohesive cloud experience with complete app and DevOps portability, a single pane of glass, and automated full-stack lifecycle management with continuous updates.

Mirantis serves many of the world’s leading enterprises, including Adobe, DocuSign, Liberty Mutual, PayPal, Reliance Jio, Societe Generale, Splunk, and Volkswagen. Learn more at www.mirantis.com.

Job Description

We are looking for an experienced L2 Engineer to operate and support high-performance AI infrastructure platforms, including NVIDIA GPU clusters, InfiniBand fabrics, and Kubernetes-based IaaS environments.

This role focuses on deep infrastructure expertise, ensuring performance, scalability, and reliability of the platform layer that powers AI workloads — without being responsible for the workloads themselves.

You will play a key role in bare metal lifecycle management, advanced InfiniBand troubleshooting, and platform stability, working closely with engineering teams to operate cutting-edge infrastructure at scale.

Key responsibilities:

  • Troubleshoot and maintain InfiniBand fabrics, including performance tuning, link issues, and topology validation.
  • Act as the escalation point for L1 support on complex infrastructure and hardware issues.
  • Own and maintain accurate infrastructure modeling, IPAM, and source-of-truth data in NetBox.
  • Own InfiniBand fabric management and advanced troubleshooting, utilizing Verity for configuration, monitoring, and optimization of high-performance interconnects. 
  • Diagnose and resolve issues across GPU servers, networking, storage, and Kubernetes platforms.
  • Perform deep hardware and system-level diagnostics (GPUs, PCIe, NICs, firmware, etc.).
  • Support Kubernetes platform stability (node health, networking, scheduling issues).
  • Contribute to automation of provisioning and operational workflows.
  • Lead incident response, root cause analysis (RCA), and post-incident improvements.
  • Collaborate with vendors and internal engineering teams on complex issues.
  • Support infrastructure upgrades, firmware management, and capacity expansion.

Qualifications

Required Skills & Experience:

  • 3–6+ years of experience in infrastructure operations, datacenter engineering, or cloud platforms.
  • Strong Linux systems expertise.
  • Hands-on experience with bare metal provisioning systems and lifecycle management.
  • Strong experience with InfiniBand networking (troubleshooting, performance, fabric management using UFM).
  • Experience with IPAM/DCIM tools such as NetBox, and with Ethernet network configuration and validation using Verity.
  • Solid understanding of datacenter networking, storage, and hardware architecture.
  • Working knowledge of Kubernetes in production environments.
  • Strong troubleshooting skills across hardware and distributed systems.

Preferred qualifications:

  • Experience with NVIDIA GPU platforms and accelerated computing infrastructure.
  • Familiarity with automation tools (Terraform, Ansible, etc.).
  • Exposure to OpenStack (optional).
  • Experience with observability stacks (Prometheus, Grafana, ELK).

Success in this role:

  • Rapid resolution of complex infrastructure and networking issues.
  • High reliability and performance of InfiniBand and GPU infrastructure.
  • Scalable and efficient bare metal provisioning processes.
  • Strong contribution to automation and operational excellence.
  • Trusted escalation point and technical leader within the team.

Additional Information

We offer:

  • Work with an established Silicon Valley leader in the cloud infrastructure industry;
  • Work with exceptionally passionate, talented and engaging colleagues, helping Fortune 500 and Global 2000 customers implement next-generation cloud technologies;
  • Be a part of cutting-edge, open-source innovation;
  • Thrive in the high-energy environment of a young company where openness, collaboration, risk-taking, and continuous growth are valued;
  • Access professional development and training;
  • Attend conferences and working groups;
  • Join company outings, happy hours, hackathons, and tech talks;
  • Receive a competitive compensation package with a strong benefits plan.

We are a Leader in Container Management on G2 (#2 after AWS)!


The Company
HQ: Campbell, CA
729 Employees
Year Founded: 1999

What We Do

We are dedicated to helping organizations increase developer productivity and ship code faster on public and private clouds. We provide a ZeroOps experience that removes the stress of managing cloud native infrastructure by combining software and automation tools with our cloud native expertise to deliver the industry's leading secure cloud platforms. Our secure and reliable cloud native platform includes validated FIPS 140-2 encryption and DISA STIG-ready capabilities.

Who do we serve? We serve a wide range of industries, building on our extensive customer experience to provide distinct value in specific verticals including Financial Services, Government & Education, Healthcare, Manufacturing, and Telecommunications. Mirantis serves many of the world’s leading enterprises, including Adobe, DocuSign, Inmarsat, PayPal, Reliance Jio, Societe Generale, Splunk, and S&P Global. Learn more at www.mirantis.com.
