Infrastructure Engineer (InfiniBand)

Sorry, this job was removed at 08:20 p.m. (CST) on Wednesday, Sep 10, 2025
2 Locations
In-Office or Remote
140K-180K Annually
Artificial Intelligence • Cloud • Hardware • Machine Learning • Other • Software • Infrastructure as a Service (IaaS)
We build infrastructure for machine learning
The Role

We are seeking an Infrastructure Engineer with a focus on InfiniBand/NCCL to join our Infrastructure Engineering team. Our engineers design and build automation, tooling, and systems that bridge the gap between physical infrastructure and the platforms that power large-scale AI/ML and HPC workloads.

This role combines the breadth of a core infrastructure engineer with a specialty in high-performance networking and GPU communication. You’ll help ensure our InfiniBand fabric and NCCL stack are tuned, reliable, and efficient at scale, supporting some of the world’s largest GPU clusters.

This is a fully remote position, although candidates must be based in the continental United States. Unfortunately, we are unable to provide sponsorship for this role.

Responsibilities
  • Design, build, and maintain automation, APIs, and frameworks to manage physical infrastructure at scale.

  • Develop and extend systems for server lifecycle management.

  • Implement and tune InfiniBand networking and NCCL configurations for multi-GPU communication.

  • Collaborate with Network, Platform, and Infrastructure Operations teams to support new infrastructure rollouts.

  • Diagnose and improve performance across GPU, NVSwitch, PCIe, and InfiniBand layers.

  • Write clear design documents and technical documentation to capture best practices.

Qualifications
  • 8+ years of professional experience in infrastructure engineering, HPC, or related domains.

  • Strong experience with Linux in production environments.

  • Proficiency in Python or similar languages for automation.

  • Deep understanding of InfiniBand networking (CX7 HCAs, fabrics, partitioning, GPUDirect).

  • Familiarity with NCCL, CUDA, and GPU topology optimization.

  • Knowledge of containerization and orchestration concepts.

  • Strong written and verbal communication skills.

Ideal Experiences
  • Experience with Dell PowerEdge XE9680 or other GPU-dense servers.

  • Prior work with NVIDIA H100s, NVSwitch, and large-scale NCCL testing.

  • Familiarity with Mellanox OFED, UCX, and Redfish/iDRAC for management.

  • Broader experience across infrastructure areas (storage, virtualization, networking).

Culture
  • Enjoy collaborating with a motivated, execution-focused team.

  • Comfortable operating with autonomy while aligning to company objectives.

  • Value precision, documentation, and knowledge-sharing.

  • Excited to grow as both a domain specialist (InfiniBand/NCCL) and a generalist infrastructure engineer.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $140K - $180K




The Company
HQ: San Francisco, CA
135 Employees
Year Founded: 2023

What We Do

The market for cutting-edge ML compute is broken. Startups, researchers and even big AI labs are scrambling to buy or rent access to the latest chips for ML training. But demand far outstrips supply, and what’s available is only accessible to the well-resourced, placing an artificial damper on innovation.

To solve this challenge, we've launched Voltage Park, and we’re on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities, to seed-stage startups and nonprofits.

With around 24,000 NVIDIA H100 GPUs, the Voltage Park cloud is one of the most powerful collections of cutting-edge ML compute in the world. Our clusters consist of 80GB H100 SXM5 GPUs fully interconnected with 3.2 Tb/s InfiniBand.

Why Work With Us

You’ll play a pivotal role as a member of the founding team that will change the face of machine learning infrastructure. As an early hire, you’ll have outsize influence in defining the company’s culture and ensuring mission success.
