Senior Software Engineer, CUDA Core Libraries

Hiring Remotely in Munich, Bayern, DEU
In-Office or Remote
300K-300K Annually
Senior level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
As a Senior Software Engineer, you will develop CUDA Core Libraries in C++ and Python, optimizing GPU algorithms and improving developer experience across stacks through collaboration and independent project management.
Summary Generated by Built In

NVIDIA’s accelerated computing platform is the foundation of modern HPC and AI. At the core of this platform are the CUDA Core Libraries: C++ and Python libraries that enable developers to write fast, reliable, and scalable GPU-accelerated software! We are hiring a full-time Software Engineer to work on the CUDA Core Libraries that power GPU computing for both C++ and Python developers. This includes projects such as CCCL (Thrust, CUB, libcudacxx), cuda-python, and numba-cuda. You will join the team building the foundational libraries, algorithms, and language/runtime infrastructure that make CUDA a speed-of-light experience for developers across deep learning, scientific computing, and data analytics!

What you’ll be doing:

  • Develop and implement CUDA Core Libraries in C++ and/or Python, including parallel algorithms and idiomatic language bindings for core CUDA functionality.

  • Compose, optimize, and evolve GPU algorithms and APIs, from high-level interfaces down to low-level performance tuning involving memory, parallelism, and synchronization.

  • Own features end-to-end: design, implementation, testing, benchmarking, documentation, and long-term maintenance.

  • Improve developer experience across the stack: CI, tests, benchmarks, packaging, examples, and docs.

  • Collaborate with senior CUDA engineers in design reviews, code reviews, and open-source-style workflows.

  • Engage with real users through issues, performance investigations, and API feedback.

What we need to see:

  • BS, MS, or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.

  • 8+ years of related development experience.

  • Strong programming skills in C++, Python, or both, with proven interest in systems-level software (performance, memory, concurrency, API design).

  • Solid understanding of modern C++ (templates, generics, standard library) and/or Python library development and packaging.

  • Practical experience with parallel or heterogeneous programming (CUDA, OpenMP, GPU-accelerated Python, or similar).

  • Experience contributing to production software or open-source libraries, including testing, profiling, and code review.

  • Ability to work independently, scope problems, and drive projects to completion.

  • Clear written communication for technical design and documentation.

  • Comfort navigating large, multi-language codebases (C++, Python, CMake, Pixi, CI systems).

Ways to stand out from the crowd:

  • Strong understanding of CPU/GPU architecture and how hardware details affect performance.

  • Hands-on experience with CUDA C++, CUDA Python, PyTorch, JAX, Numba, CuPy, or similar GPU-accelerated stacks.

  • Familiarity with Thrust, CUB, libcudacxx, or other modern C++/GPU libraries.

  • Experience with compiler infrastructure or tooling (LLVM, Clang tooling, MLIR).

  • Demonstrated interest in developer tools, library design, and making other developers faster.

If you care deeply about performance, enjoy working at the C++/Python boundary, and want to shape the core CUDA libraries relied on by thousands of developers, this role is a direct fit.

Top Skills

C++
CUDA
Python
The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
