NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. We are now seeking a highly motivated Infrastructure, Tools & AI Engineering Manager to join our Ethernet Switching group, working on SONiC Network OS. In this role, you will own and drive the engineering infrastructure that powers the full product development lifecycle — from development environments and CI pipelines through regression, code coverage, and test efficiency. You will apply cutting-edge AI and LLM capabilities to transform how we analyze failures, generate test coverage, and accelerate product quality.
What you’ll be doing:
Lead and mentor a team of infrastructure and tooling engineers; set technical direction, define priorities, and grow team capabilities
Design, build, and maintain scalable infrastructure for development, integration, and test environments supporting SONiC OS
Architect and deliver LLM-based tools for intelligent regression analysis — failure classification, root cause clustering, anomaly detection, and test flakiness prediction
Lead efforts to reduce regression runtime through parallelization, smart test selection, and dependency-aware scheduling
Develop deep technical knowledge of SONiC Network OS internals, including its subsystem architecture, SAI/ASIC abstraction layer, and management plane
What we need to see:
B.Sc. degree in Engineering, Computer Science, or a related field, or equivalent experience
8+ years of overall software engineering experience, including at least 3 years in a leadership role managing software development teams
Proven ability to lead technical teams: hiring, mentoring, technical roadmapping, and cross-team influence
Experience developing software testing tools and test infrastructure
Strong Python programming skills; experience building production-quality automation frameworks and tooling
Demonstrated experience designing and operating CI/CD systems at scale (Jenkins, GitLab CI, GitHub Actions, or equivalent)
Hands-on experience with LLMs or AI-assisted developer tooling — building, integrating, or productizing AI capabilities in an engineering workflow
Strong analytical and problem-solving skills with a bias toward measurable outcomes and data-driven decisions
Ways to stand out from the crowd:
Deep Linux expertise: system internals, networking stack, process management, and scripting
Prior experience building LLM-powered test analysis pipelines or AI-enhanced DevOps tooling in a real production environment
Knowledge of networking protocols and hardware: Ethernet switching, L2/L3 protocols, QoS, VLANs, high-performance data center networking
Experience with code coverage instrumentation in large-scale C/Python codebases and using coverage data for test prioritization
Track record of measurably improving regression runtime, test reliability, or CI throughput in a complex embedded or systems software environment
Skills Required
- B.Sc. degree or higher in Computer Science, Software Engineering, or a related field
- 8+ years of software engineering experience
- 3+ years in an infrastructure, DevOps, or tooling leadership role
- Strong Python programming skills
- Experience designing and operating CI/CD systems at scale
- Hands-on experience with LLMs or AI-assisted developer tooling
- Proven ability to lead technical teams
- Strong analytical and problem-solving skills
NVIDIA Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about NVIDIA and has not been reviewed or approved by NVIDIA.
- Equity Value & Accessibility: Equity awards and a discounted ESPP are highlighted as core parts of total compensation, enabling employees to share in the company’s success. Stock-based compensation and the two-year lookback ESPP are consistently described as especially valuable.
- Healthcare Strength: Health coverage is portrayed as robust, with comprehensive medical, dental, and vision options alongside mental health support and on-site care resources. Employer HSA contributions and wellness perks reinforce the depth of the offering.
- Retirement Support: Retirement programs are depicted as strong, featuring a meaningful 401(k) match with Roth options and support for Mega Backdoor Roth contributions. These elements position long-term savings as a notable advantage of the total rewards package.
NVIDIA Insights
What We Do
NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”







