Senior Software Engineer - Kubernetes AI Scheduler

Posted 3 Hours Ago
Tel Aviv, ISR
In-Office
Senior level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
Design, implement, and maintain Go-based scheduler components for AI workloads on Kubernetes. Improve scalability for massive clusters, solve placement and optimization problems, lead code/design reviews, mentor engineers, and collaborate with upstream projects, contributors, and customers to translate production feedback into engineering improvements.

KAI-Scheduler is an open-source CNCF project focused on delivering the best scheduling experience for AI workloads on Kubernetes. Adopted by AI frontier labs, leading enterprises, and some of the largest AI infrastructure deployments in the world, KAI helps organizations efficiently run AI at scale.

KAI is designed to support any AI infrastructure—from the latest GPU and networking technologies to future hardware generations—while maximizing performance, utilization, and scalability. As a Senior Software Engineer for KAI, you will help build the future of AI scheduling in the Kubernetes ecosystem, working on challenging problems spanning workload scheduling, Kubernetes internals, and large-scale AI infrastructure.

What you’ll be doing:

  • Develop clean, maintainable, and well-tested software in Go.
  • Design and implement scalability improvements for KAI, helping it operate efficiently in massive-scale deployments (thousands of nodes) while addressing Kubernetes scaling constraints and bottlenecks.
  • Apply strong algorithmic thinking to solve complex AI workload scheduling and placement challenges, balancing performance, fairness, cluster utilization, topology constraints, and scalability.
  • Conduct code and design reviews to uphold high-quality standards and mentor team members.
  • Work closely with contributors, users, and customers, helping translate feedback from production deployments into product and engineering improvements.
  • Collaborate with related upstream projects (schedulers, AI frameworks, cluster autoscalers, Kubernetes SIGs/WGs, etc.) and contribute to community and ecosystem discussions.

What we need to see:

  • B.Sc. or M.Sc. in Computer Science or a related field, or equivalent experience.
  • 8+ years of experience in backend software development, including system design and architecture.
  • 4+ years of advanced Kubernetes development experience, including designing and implementing CRDs and controllers, with deep expertise in Kubernetes internals, networking, storage, and cluster architecture.
  • Strong algorithmic skills with experience tackling complex optimization and distributed systems challenges.
  • Strong technical skills and a proven ability to collaborate with and mentor other engineers.

We are an equal-opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Skills Required

  • B.Sc. or M.Sc. in Computer Science or related field or equivalent experience
  • 8+ years backend software development experience including system design and architecture
  • 4+ years advanced Kubernetes development experience, including designing and implementing CRDs and controllers
  • Deep expertise in Kubernetes internals, networking, storage, and cluster architecture
  • Proficiency in Go (Golang) and writing well-tested backend software
  • Strong algorithmic skills and experience with optimization and distributed systems challenges
  • Proven ability to collaborate, conduct code/design reviews, and mentor other engineers

NVIDIA Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about NVIDIA and has not been reviewed or approved by NVIDIA.

  • Equity Value & Accessibility: Equity awards and a discounted ESPP are highlighted as core parts of total compensation, enabling employees to share in the company’s success. Stock-based compensation and the two-year lookback ESPP are consistently described as especially valuable.
  • Healthcare Strength: Health coverage is portrayed as robust, with comprehensive medical, dental, and vision options alongside mental health support and on-site care resources. Employer HSA contributions and wellness perks reinforce the depth of the offering.
  • Retirement Support: Retirement programs are depicted as strong, featuring a meaningful 401(k) match with Roth options and support for Mega Backdoor Roth contributions. These elements position long-term savings as a notable advantage of the total rewards package.

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”

