Senior Software Engineer, Cloud Functions

Posted 11 Days Ago
2 Locations
In-Office or Remote
184K-357K Annually
Senior level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
Design and implement scalable software for GPU/CPU diagnostics and repairs across cloud infrastructures, leading impactful AI projects.

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us accelerate the next wave of artificial intelligence.

The team delivers NVIDIA Mission Control Software that runs on superpods. The software we develop is shipped as an autonomous hardware recovery engine and is responsible for baseline validation tests, taking remedial actions (break/fix workflows), and periodic health checks for hardware components. We are looking for a Senior Software Engineer with experience in building highly scalable and robust enterprise software to join us. We are building and improving a powerful platform that will automate the diagnosis and repair of a cluster of GPUs or CPUs across public clouds, private clouds, and virtual and physical hardware.

What you'll be doing:

  • Designing and implementing scalable and reliable software components that enable the core platform to maintain an inventory of resources, including hosts, GPUs, and switches; to automate actions that diagnose failures; and to repair them

  • Enabling Agentic AI within the core platform to create remedial workflows

  • Influencing the product roadmap in collaboration with teams across various departments with the goal of reducing SRE toil and improving hardware utilization

  • Collaborating with various organizations across NVIDIA to drive adoption of the platform in order to improve GPU utilization

  • Defining and running benchmarks for various subsystems

  • Leading and delivering high-impact projects with high quality, performance, and stability while minimizing resource consumption

  • Developing a robust feedback control system that analyzes signals about system health and automatically runs commands to fix discovered issues

  • Programming in modern languages like Go and Rust

What we need to see:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience)

  • Keen interest in driving Agentic AI projects

  • 10 years of relevant experience

  • Demonstrated ability in building scalable and robust distributed systems

  • Proven record of product rollouts and collaborating with early adopters

  • Proficiency in programming in C/C++, Java, Rust, or Go

  • Technical stewardship of projects across the organization

Ways to stand out from the crowd:

  • Deep understanding of multi-threading and distributed systems concepts

  • Excellent track record of delivering projects

  • Expertise in optimizing SQL queries

  • Expert-level knowledge of Go/Rust programming

With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the technology industry's most desirable employers. We have some of the most forward-thinking and versatile people in the world working with us, and our engineering teams are growing fast in some of the most impactful fields of our generation: Cloud Engineering and Cloud Functions. If you're a creative engineer who enjoys autonomy and shares our passion for technology, we want to hear from you.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and dedicated people in the world working for us. If you're creative and passionate about developing cloud services, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 3, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

C/C++
Go
Java
Rust

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
