We are now looking for a Senior System Software Engineer. NVIDIA is the leading artificial intelligence computing company and is paving the way with innovations in self-driving cars, machine learning, supercomputing, gaming, and visualization. NVIDIA gives automakers, research institutions, cloud providers, large companies, and start-ups the power and flexibility to develop and deploy breakthrough artificial intelligence systems. We are an enthusiastic and dedicated team at the forefront of the latest science and technology trends. Working together, we provide a private on-site cloud solution that enables the rest of the organization to quickly release high-quality software. Are you passionate about infrastructure and looking for complex and challenging problems? Are you ready to build the next generation of cloud services and design innovative solutions that address the needs of a whole organization? Then we would be excited to have a motivated person like you join us!
What you'll be doing:
Spearhead innovation to architect and deliver highly reliable, performant, and scalable cloud-native systems.
Lead the design and development of next-generation microservices and distributed systems with a strong emphasis on performance optimization and cost efficiency.
Define and evolve system architecture strategies, ensuring alignment with long-term business and technical goals.
Tackle complex challenges in job orchestration, resource optimization, and self-healing infrastructure with a focus on automation and resilience.
Build and scale end-to-end observability solutions including metrics pipelines, alerting frameworks, and telemetry storage.
Leverage data analytics and predictive modeling to proactively improve system behavior and reliability.
Provide technical leadership and mentorship across teams, and collaborate cross-functionally with product, infrastructure, and operations groups to drive strategic initiatives and foster a culture of engineering excellence and continuous improvement.
Design and operate massively scalable systems—handling thousands to millions of jobs and servers—using deep expertise in Kubernetes and public cloud platforms (AWS, Azure, GCP).
What we need to see:
Demonstrated experience in building and scaling large-scale cloud infrastructure platforms.
10+ years of proven experience in software engineering with a strong track record of delivering enterprise-grade cloud solutions; BS/MS/Ph.D. in Computer Science, Computer Engineering, or equivalent experience.
Deep expertise in microservices architecture, with hands-on experience designing and developing scalable, distributed systems.
Extensive experience with public cloud platforms (AWS, Azure, GCP), including scaling infrastructure to support thousands to millions of jobs and servers.
Strong Kubernetes expertise, including container orchestration and cloud-native tooling for deployment, monitoring, and management.
Proficiency in both SQL (e.g., MySQL) and NoSQL (e.g., Elasticsearch) databases, with a solid understanding of scalable storage systems.
Hands-on experience with Web Services (SOAP/REST), messaging systems like Kafka, and CI/CD and version control tools such as Jenkins, Git, and Perforce.
Excellent debugging, problem-solving, and communication skills, with the ability to lead and collaborate effectively in a globally distributed, multi-time-zone environment.
Ways to stand out from the crowd:
Proven ability to deconstruct complex systems into modular, scalable components with measurable outcomes, and to scale systems to handle millions of concurrent jobs and global workloads.
Expertise in optimizing cloud infrastructure for performance, reliability, and cost.
Solid collaborative and interpersonal skills, specifically a proven ability to guide and influence effectively within a dynamic environment.
Relentless drive to push the boundaries of system performance and reliability.
We are an equal opportunity employer and value diversity at our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform crucial job functions, and to receive other benefits and privileges of employment.
What We Do
NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”





