Deep Learning Software Engineer, Inference - New College Grad 2026

Reposted 9 Hours Ago
Santa Clara, CA
In-Office
120K-236K Annually
Entry level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
As a Deep Learning Software Engineer, you will optimize GPU-accelerated software for AI applications, collaborating on open-source frameworks, performance tuning, and enhancing inference libraries.
Summary Generated by Built In

NVIDIA seeks a Software Engineer specializing in Deep Learning Inference for our growing team. As a key contributor, you will help design, build, and optimize the GPU-accelerated software that powers today’s most sophisticated AI applications. Our team is responsible for developing and maintaining high-performance open-source frameworks, which are at the forefront of efficient large-scale model serving and inference. You will play a central role in improving these platforms, facilitating smooth deployment and serving of groundbreaking language models.

You’ll work closely with the deep learning community to implement the latest algorithms for public release in inference frameworks. Your work will focus on identifying and driving performance improvements for state-of-the-art LLM and Generative AI models across NVIDIA accelerators, from datacenter GPUs to edge SoCs. You'll bring to bear open-source tools and plugins—including CUTLASS, OAI Triton, NCCL, and CUDA kernels—to implement and optimize model serving pipelines.

What you'll be doing:

  • Performance optimization, analysis, and tuning of DL models in various domains like LLM, Multimodal and Generative AI.

  • Scale performance of DL models across different architectures and types of NVIDIA accelerators.

  • Contribute features and code to NVIDIA’s inference libraries, vLLM, SGLang, FlashInfer, and LLM software solutions.

  • Collaborate with cross-functional teams working on frameworks, NVIDIA libraries, and innovative inference optimization solutions.

What we need to see:

  • Pursuing a Master's or PhD, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI).

  • C/C++ programming and software design skills. Agile software development skills are helpful, and Python experience is a plus.

  • Experience with training, deploying or optimizing the inference of DL models in production is a plus.

  • Modeling, profiling, debugging, and code optimization skills, or architectural knowledge of CPUs and GPUs, are a plus.

  • GPU programming experience (CUDA, OAI Triton, or CUTLASS) is a plus.

Ways to stand out from the crowd:

  • Contributions to deep learning software projects such as PyTorch, vLLM, and SGLang that drive advancements in the field.

  • Experience with multi-GPU communication (NCCL, NVSHMEM).

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our special engineering teams are growing fast. If you're a creative and autonomous engineer with a genuine passion for technology, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 120,000 USD - 189,750 USD for Level 2, and 148,000 USD - 235,750 USD for Level 3.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until December 5, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

C/C++
CUDA
CUTLASS
NCCL
OAI Triton
Python

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
