Senior Technical Product Manager – DGX Enterprise Infrastructure and Cloud-Native Operations

Posted 5 Days Ago
Santa Clara, CA, USA
In-Office
208K-380K Annually
Senior level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
The role involves defining and productizing the operational standards for enterprise AI infrastructure, focusing on seamless integration, management, and automation of DGX systems in private data centers.
Summary Generated by Built In

NVIDIA is seeking a world-class Senior Product Manager to architect the operational future of Enterprise AI. While the NVIDIA DGX is the undisputed "Gold Standard" for AI performance, the enterprise on-premises environment presents an outstanding challenge: How do you make a 1,000-node private cluster feel as fluid, scalable, and invisible as the public cloud? The mission is to deliver the "NVIDIA Experience" within the customer’s data center. In this role, you will own the software-defined blueprint that transforms raw DGX hardware into a high-availability, self-healing AI Factory!

What You’ll Be Doing:

In this role, you will set the vision for the Enterprise Operational Gold Standard. You will define how the world’s most sophisticated companies deploy, manage, and scale their Enterprise AI Factories.

  • Productize the On-Prem Lifecycle: Define the "Day 0 through Day 2" experience for DGX SuperPODs. Lead the development of products that handle everything from bare-metal provisioning and network fabric configuration to automated "one-click" firmware rollouts. 

  • Build the "Pit Crew" (Observability): Develop a definitive telemetry and diagnostic suite. When a job slows down in a private data center, your framework should provide the "one-click" answer—isolating a thermal throttle, a degraded InfiniBand rail, or a cabling fault instantly. 

  • Bridge Hardware to Kubernetes: Lead the integration of DGX systems into the cloud-native ecosystem. Ensure that enterprise-grade features like GPU partitioning (MIG), multi-node scaling, and niche scheduling are declarative and seamless. 

  • Standardize at Scale: You aren't just building scripts; you are building APIs and services. Your goal is to eliminate "management snowflakes," ensuring that every enterprise DGX deployment is standardized, repeatable, and resilient.

  • Drive Predictive Operations: Move the needle from reactive maintenance to self-healing infrastructure. Thoughtfully define the features for automated health checks that keep the fleet at peak performance without manual intervention.  

What We Need to See:

  • Enterprise Data Center DNA: 12+ years of demonstrated ability in Product Management, with a specific focus on on-premises infrastructure, private cloud, or large-scale systems management.

  • Bachelor's degree in Computer Science or a related field, or equivalent experience.

  • The "Platform-First" Approach: A track record of turning complex hardware operations into software-defined workflows. You understand that in the enterprise, Product = Hardware + Software + Operations. 

  • Cloud-Native Expertise: Expert-level understanding of Kubernetes operators, container orchestration, and how to translate physical hardware constraints into declarative code. 

  • Operational Scars: You’ve lived through the challenges of managing large-scale Linux fleets in air-gapped or restricted enterprise environments. You know what keeps SREs up at night. 

  • Technical Breadth: Deep familiarity with data center networking (InfiniBand/Ethernet), storage architectures, and the firmware-to-OS handshake. 

  • Leadership & Evolution: This is a high-visibility role at the intersection of multiple engineering fields. As you define the NVIDIA Datacenter Experience, you will be positioned on a direct leadership track, with the explicit expectation to transition into formal people management as the team expands. 

 

Ways to Stand Out from the Crowd: 

  • Automation Evangelist: You have experience with infrastructure-as-code (Ansible, Terraform, Pulumi) in a bare-metal context. 

  • AIOps Pioneer: You have a vision for using AI to manage AI—applying telemetry and machine learning to predict and prevent infrastructure failures. 

  • The NVIDIA Narrative: You believe the "Gold Standard" isn't just about speed—it's about the reliability and simplicity of the Automated Pit Crew. 

NVIDIA is widely considered one of the technology world’s most desirable employers. We have some of the world's most forward-thinking and hardworking people on our team. If you're creative and autonomous, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 208,000 USD - 327,750 USD for Level 5, and 240,000 USD - 379,500 USD for Level 6.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 15, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Skills Required

  • 12+ years of Product Management experience
  • Bachelor's Degree in Computer Science or related field
  • Expert-level understanding of Kubernetes and container orchestration
  • Experience with infrastructure-as-code in a bare-metal context
  • Deep familiarity with data center networking and storage architectures

NVIDIA Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about NVIDIA and has not been reviewed or approved by NVIDIA.

  • Equity Value & Accessibility: Equity awards and a discounted ESPP are highlighted as core parts of total compensation, enabling employees to share in the company’s success. Stock-based compensation and the two-year lookback ESPP are consistently described as especially valuable.
  • Healthcare Strength: Health coverage is portrayed as robust, with comprehensive medical, dental, and vision options alongside mental health support and on-site care resources. Employer HSA contributions and wellness perks reinforce the depth of the offering.
  • Retirement Support: Retirement programs are depicted as strong, featuring a meaningful 401(k) match with Roth options and support for Mega Backdoor Roth contributions. These elements position long-term savings as a notable advantage of the total rewards package.

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
