NVIDIA is a leading artificial intelligence computing company, paving the way with innovations in self-driving cars, machine learning, supercomputing, gaming, and visualization. NVIDIA gives automakers, research institutions, cloud providers, large companies, and start-ups the power and flexibility to develop and deploy breakthrough artificial intelligence systems. We are an enthusiastic and dedicated team at the forefront of the latest science and technology trends. Working together, we provide a private on-site cloud solution that enables the rest of the organization to quickly release high-quality software. Are you passionate about infrastructure and looking to take on complex challenges, craft innovative solutions, and mine through data to uncover real problems and fix them? Are you ready to build the next generation of cloud services and design innovative solutions that address the needs of a whole organization? Then we are excited to have a motivated person like you.
What you'll be doing:
Design creative cloud solutions that scale to thousands of systems
Own the whole life cycle of new features, from requirements gathering to design documentation to validation and deployment.
Work on challenging infrastructure problems such as job scheduling, resource management, and automated recovery.
Work on an IaaS platform used for self-service reservation of bare-metal machines, where engineers can reserve machines ahead of time for development and debugging purposes. The solution includes a web portal, a CLI client, a Java-based middle layer, and an OpenStack backend.
What we need to see:
10+ years of experience designing, building, and deploying large-scale distributed systems in cloud environments
Strong programming expertise in Java and/or Python with a solid object-oriented design background
Proven experience delivering solutions using Agile methodologies
Hands-on experience with cloud infrastructure and containerized platforms (Docker, Kubernetes, Kubernetes Cluster API)
Deep knowledge of distributed systems components, including message brokers
Strong experience with relational databases (MySQL) and NoSQL systems (Elasticsearch)
Excellent debugging, problem-solving, and performance optimization skills
Effective collaborator with strong written and interpersonal communication skills, experienced in working across distributed teams and time zones
BS/MS in Computer Science or Computer Engineering, or practical experience considered equivalent
Ways to stand out from the crowd:
Hands-on experience building and operating distributed systems, containerized workloads, and working directly with the Kubernetes API in production environments.
Strong background in computer algorithms, with a proven ability to select optimal approaches for solving complex and high-scale problems.
Skilled at breaking down complex systems into smaller, reusable components, leveraging existing solutions to deliver robust implementations efficiently.
Proven experience designing, implementing, and rolling out major infrastructure features incrementally across multiple servers with minimal disruption.
Experience applying Agentic AI in infrastructure systems, with a focus on designing simple, reliable architectures that require minimal operational overhead.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and dedicated people in the world working for us. If you're creative and passionate about developing cloud services, we want to hear from you! Our company is an equal opportunity employer and values diversity. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
What We Do
NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”





