Senior Deep Learning Systems Software Engineer - AI Infrastructure

Posted 13 Days Ago
2 Locations
Senior level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
The role involves optimizing deep learning workloads, building analysis tools, and collaborating on cloud application performance on GPU architectures.

NVIDIA is an industry leader with groundbreaking developments in High-Performance Computing, Artificial Intelligence and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is seeking senior engineers who focus on performance analysis and optimization to help us squeeze every last clock cycle out of all facets of Deep Learning, such as training and inference, which are among today's most important workloads. If you are unafraid to work across all layers of the hardware/software stack, from GPU architecture to deep learning frameworks, to achieve peak performance, we want to hear from you! This role offers an opportunity to directly impact the hardware and software roadmap in a fast-growing technology company that leads the AI revolution while helping deep learning users around the globe enjoy ever-higher training speeds.

What you'll be doing:

  • Understand, analyze, profile, and optimize deep learning workloads on state-of-the-art hardware and software platforms.

  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

  • Collaborate with cross-functional teams to analyze and optimize cloud application performance on diverse GPU architectures.

  • Identify bottlenecks and inefficiencies in application code and propose optimizations to enhance GPU utilization.

  • Drive end-to-end platform optimization from the hardware level to the application and service levels.

  • Design and implement performance benchmarks and testing methodologies to evaluate application performance.

  • Provide guidance and recommendations on optimizing cloud-native applications for speed, scalability, and resource efficiency.

  • Share knowledge and best practices with domain expert teams as they transition applications to distributed environments.

What we need to see:

  • Master's degree in CS, EE, or CSEE, or equivalent experience.

  • 5+ years of experience in application performance engineering.

  • Experience using large-scale, multi-node GPU infrastructure on premises or with cloud service providers (CSPs).

  • Background in deep learning model architectures and experience with PyTorch and large-scale distributed training.

  • Experience with application profiling tools such as NVIDIA Nsight and Intel VTune.

  • Deep understanding of computer architecture and familiarity with the fundamentals of GPU architecture. Experience with NVIDIA's infrastructure and software stacks.

  • Proven experience analyzing, modeling and tuning DL application performance.

  • Proficiency in Python and C/C++ for analyzing and optimizing application code.

Ways to stand out from the crowd:

  • Strong fundamentals in algorithms and GPU programming experience (CUDA or OpenCL).

  • Understanding of NVIDIA's server and software ecosystem.

  • Hands-on experience in performance optimization and benchmarking on large-scale distributed systems.

  • Hands-on experience with NVIDIA GPUs, HPC storage, networking, and cloud computing.

  • In-depth understanding of storage systems, Linux file systems, and RDMA networking.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you.

Top Skills

C/C++
CUDA
Intel VTune
NVIDIA Nsight
OpenCL
Python
PyTorch

The Company
HQ: Santa Clara, CA
21,960 Employees
On-site Workplace
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
