Senior Software Engineer - NIM Production and Automation

Posted Yesterday
2 Locations
In-Office
Senior level
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
Develop components for NVIDIA Inference Microservices, ensuring performance and scalability using Docker and Kubernetes. Mentor team members and manage automation applications and CI/CD pipelines.
Summary Generated by Built In

NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a Senior Software Engineer to develop components used by the software factory automation for NVIDIA Inference Microservices (NIMs) and their deployed services. The right person for this role brings technical drive and creativity to change the way NVIDIA provides high-performance inferencing for every AI model. Our NIM offerings are easy to use, optimized for performance, and developed using a highly automated software factory. We create both downloadable containers and hosted services.

You will apply your expertise to develop highly available services that make effective use of the thousands of GPUs involved in this operation. Your services will provide best-in-class performance, accuracy, and availability. We are looking for technical talent to design, build, operate, and improve our capabilities to produce NIMs at scale, including the underlying infrastructure, pipelines, inference backends, Docker builds, test harnesses, metrics, performance engineering, log ingestion, and more.

What you'll be doing:

  • Design, build, and optimize containerized inference execution for various AI applications, ensuring efficiency and scalability. These applications may run in container orchestration platforms like Kubernetes to enable scalable and robust deployment.

  • Develop and deploy automation applications and microservices (e.g., in Python, Go) supporting the NIM factory.

  • Ensure the performance, scalability, and availability of NIMs and the automation infrastructure through comprehensive performance measurement, monitoring, and optimization.

  • Implement and manage CI/CD pipelines for automated testing and deployment.

  • Apply container and orchestration expertise (Docker, Kubernetes) to create and optimize the basic building blocks of NIMs and automation tooling.

  • Collaborate, brainstorm, and improve the designs of inference solutions with a broad team of software engineers, researchers, SREs, and product management.

  • Mentor and collaborate with team members and other teams to foster growth and development. Demonstrate a history of learning and enhancing both personal skills and those of colleagues.

What we need to see:

  • A history of using advanced programming skills (e.g., Python, Go) to build distributed compute systems, backend services, microservices, and cloud technologies.

  • Experience productionizing and deploying various types of AI models (e.g., foundation models, computer vision, speech recognition).

  • Experience implementing robust CI/CD pipelines for automated testing and deployment.

  • Experience working effectively with multi-functional teams, principals, and architects across organizational boundaries.

  • Mentorship experience and the ability to grow teams and team members.

  • Deep technical expertise in distributed containerized applications using Docker, Kubernetes, Cloud Endpoints, Helm, and Prometheus.

  • Passion for building scalable and performant microservice applications.

  • Excellent interpersonal skills and the flexibility to lead multi-functional efforts.

  • Proven experience debugging and analyzing the performance of distributed microservices or cloud systems.

  • A degree in Computer Science, Computer Engineering, or a related field (BS or MS) or equivalent experience.

  • 6+ years of demonstrated experience developing performant microservices, cloud software, and/or tooling.

Ways to stand out from the crowd:

  • Experience with multiple container engines and the internals of container images and runtimes.

  • Prior experience building and deploying containers for microservice, cloud, and on-prem deployments.

  • Background with large-scale full-stack development.

  • Experience delivering event-driven applications using services such as Temporal, Kafka, Redis, or similar.

  • Previous work in large-scale backend development.

We are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and creative people in the world working for us. If you're creative and autonomous with a real passion for technology, we want to hear from you.

Top Skills

Cloud Endpoints
Docker
Go
Helm
Kubernetes
Prometheus
Python

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”

