Senior High-Performance LLM Training Engineer

Reposted 14 Days Ago
Be an Early Applicant
Santa Clara, CA
In-Office
184K-357K Annually
Senior level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
The Senior High-Performance LLM Training Engineer optimizes LLM training workloads on GPUs, implements software in NVIDIA's deep learning platform, and automates workload analysis.
Summary Generated by Built In

We are now looking for a Senior High-Performance LLM Training Engineer!

NVIDIA is seeking experienced engineers specializing in performance analysis and optimization to improve the efficiency of LLM training workloads, which are shaping the world's most advanced computing systems. This position focuses on optimizing NVIDIA’s high-performance LLM software stack in frameworks like PyTorch and JAX for high-performance training on thousands of GPUs, while also helping shape hardware roadmaps for the next generation of GPUs powering the AI revolution.

What you will be doing:

  • Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.

  • Understand the big picture of training performance on GPUs, prioritizing and then solving problems across all state-of-the-art neural networks.

  • Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks.

  • Build and support NVIDIA submissions to the MLPerf Training benchmark suite.

  • Implement key DL training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.

  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

What we want to see:

  • PhD in Computer Science, Electrical Engineering or Computer Engineering and 5+ years; or MS (or equivalent experience) and 8+ years of meaningful work experience.

  • Strong background in deep learning and neural networks, in particular training.

  • A deep background in computer architecture and familiarity with the fundamentals of GPU architecture.

  • Proven experience analyzing and tuning application performance & processor and system-level performance modelling.

  • Programming skills in C++, Python, and CUDA.

GPU computing is the most productive and pervasive platform for deep learning and AI. It begins with the most advanced GPUs and the systems and software we build on top of them. We integrate and optimize every deep learning framework. We work with the major systems companies and every major cloud service provider to make GPUs available in data centers and in the cloud. We craft computers and software to bring AI to edge devices, such as self-driving cars and autonomous robots. AI has the potential to spur a wave of social progress unmatched since the industrial revolution.

Widely considered to be one of tech's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. Additionally, this opportunity offers you the ability to collaborate with some of the most forward-thinking and hard-working people in the world, shaping the future of AI in a creative and autonomous work environment that encourages innovation. If you're excited to work across the full hardware & software stack—from GPU architecture to application code—to achieve optimal performance, we want to hear from you!

#LI-Hybrid

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until July 29, 2025.NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

C++
Cuda
Jax
Python
PyTorch
Am I A Good Fit?
beta
Get Personalized Job Insights.
Our AI-powered fit analysis compares your resume with a job listing so you know if your skills & experience align.

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”

Similar Jobs

Klaviyo Logo Klaviyo

Senior Director, G&A Systems

Consumer Web • eCommerce • Marketing Tech • Retail • Software • Analytics • Generative AI
Hybrid
San Francisco, CA, USA
2400 Employees
208K-312K Annually

ServiceNow Logo ServiceNow

Business Operations Analyst

Artificial Intelligence • Cloud • HR Tech • Information Technology • Productivity • Software • Automation
Remote or Hybrid
Santa Clara, CA, USA
28000 Employees
89K-89K Annually

ServiceNow Logo ServiceNow

Product Manager

Artificial Intelligence • Cloud • HR Tech • Information Technology • Productivity • Software • Automation
Remote or Hybrid
Santa Clara, CA, USA
28000 Employees
164K-286K Annually

ServiceNow Logo ServiceNow

Director, Outbound Product Management, Public Sector Industry AI Products

Artificial Intelligence • Cloud • HR Tech • Information Technology • Productivity • Software • Automation
Remote or Hybrid
Santa Clara, CA, USA
28000 Employees
218K-381K Annually

Similar Companies Hiring

Credal.ai Thumbnail
Software • Security • Productivity • Machine Learning • Artificial Intelligence
Brooklyn, NY
Standard Template Labs Thumbnail
Software • Information Technology • Artificial Intelligence
New York, NY
10 Employees
Scotch Thumbnail
Software • Retail • Payments • Fintech • eCommerce • Artificial Intelligence • Analytics
US
25 Employees

Sign up now Access later

Create Free Account

Please log in or sign up to report this job.

Create Free Account