Lead Principal Engineer, Enterprise Agentic AI Platform

Santa Clara, CA, USA
In-Office
272K-431K Annually
Expert/Leader
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
Design, build, and ship production-grade agentic AI systems and platform services using Python/Go. Implement multi-agent orchestration, memory and RAG pipelines, Kubernetes deployment, GPU inference tuning, observability, security controls, and evaluation/telemetry. Prototype rapidly, harden POCs into SDKs/APIs, and collaborate across teams to operationalize scalable, secure enterprise AI agents.
Summary Generated by Built In

Join NVIDIA IT’s Enterprise AI & Automation team to build and expand enterprise-grade agentic AI systems at one of the world’s most advanced AI companies. NVIDIA’s Enterprise AI Platform powers production AI agents that securely integrate with enterprise systems to boost employee efficiency and accelerate business results across engineering, IT, supply chain, finance, HR, and sales. We need a Principal or Distinguished Engineer-level architect who defines systems by building them. This role calls for a hands-on technical leader who writes code daily in Python and/or Go and prototypes quickly with modern code-generation tools such as Cursor, Claude Code, and Claude Cowork. You must understand infrastructure from Kubernetes to GPU inference stacks and translate emerging agent development patterns into scalable platform capabilities. You will shape NVIDIA’s enterprise agent architecture by delivering working systems, developing reference implementations, and raising technical standards across the organization. This is not a strategy-only or governance-only role. Architecture authority is earned through production systems, measurable impact, and technical depth.

If you thrive in ambiguous environments, move rapidly from concepts to operational systems, and think in terms of the full agent lifecycle (create, sandbox, launch, observe, control, and continuously improve through data-driven cycles), this position lets you define enterprise-grade agentic AI at NVIDIA scale. You will build systems that incorporate persistent memory, controlled runtime environments, rigorous evaluation, and GPU-accelerated performance, ensuring agents are intelligent, traceable, secure, and production-ready from day one.

What You Will Be Doing:

  • Develop and deliver production-quality agentic AI systems from start to finish using Python and/or Go, covering Kubernetes deployment, agent runtimes, memory systems, orchestration, tool integration, and evaluation pipelines.

  • Define and advance NVIDIA’s Enterprise Agentic AI architecture through practical implementations, reference systems, and production deployments—not abstract diagrams.

  • Build and implement multi-agent orchestration patterns (planner, executor, reviewer, tool agents) using frameworks such as LangChain, LangGraph, or similar orchestration systems, with strong regression coverage and observability.

  • Run fast, high-quality POCs on emerging agent architectures; harden successful patterns into reusable platform services, APIs, SDKs, and developer templates.

  • Architect and implement data flywheels that continuously improve agent quality through telemetry, benchmarking, automated evaluation, and structured feedback loops.

  • Embed security, guardrails, sandbox isolation, auditability, and policy enforcement directly into agent runtimes in partnership with security and governance teams.

  • Evaluate, integrate, and extend open-source and third-party agent platforms; drive disciplined build-vs-use decisions based on performance, scalability, control, and long-term platform ownership.

  • Collaborate closely with engineering, infrastructure, product, and business partners to align architectural direction with enterprise priorities and accelerate adoption.

What We Need to See:

  • Bachelor’s degree in Computer Science or related field or equivalent experience; Master’s or PhD preferred.

  • 15+ years of experience building and shipping large-scale distributed systems with significant hands-on coding in Python, Go, or similar systems languages.

  • Proven skill in quickly transitioning from an idea to a functional prototype and then to a robust, scalable platform solution.

  • Proven track record of building agentic AI systems, including RAG pipelines, long-term memory models, multi-agent orchestration (e.g., LangChain, LangGraph), tool frameworks, and evaluation infrastructure.

  • Expert-level depth in Kubernetes, containerized workloads, networking, APIs, and secure enterprise integration patterns.

  • Experience crafting benchmarking, regression testing, telemetry, and observability systems that measure agent quality, latency, cost, reliability, and safety.

  • Comprehensive knowledge of performance tuning in hybrid environments, including GPU-based inference systems.

  • Excellent collaboration skills with the ability to influence cross-functional collaborators, build positive relationships, and clearly communicate complex architectural concepts to both technical and business audiences.

Ways to Stand Out from the Crowd:

  • Proven experience delivering reusable developer-acceleration components such as SDKs, APIs, templates, reference implementations, and CI/CD automation.

  • Experience integrating enterprise vector databases and retrieval systems, and working with agentic search and orchestration platforms such as Glean, Microsoft Copilot Studio, Google Agentspace, or similar enterprise AI ecosystems.

  • Experience embedding fine-grained policy enforcement, access controls, sandbox isolation, and audit trails directly into AI runtimes.

  • A GPU-acceleration mindset, with experience optimizing model inference, batching strategies, memory utilization, and efficiency on NVIDIA hardware.

  • Evidence of meaningful open-source contributions, including core commits, maintainership, widely adopted libraries, or public technical artifacts demonstrating system-level depth.

NVIDIA is widely recognized as one of the technology world’s most sought-after employers. We have some of the most forward-thinking and creative people in the world working here. If you are driven by impact, passionate about enterprise AI, and motivated to define the blueprint of agentic AI at scale, we want to hear from you. Apply today!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 431,250 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until March 2, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

APIs
CI/CD
Claude Code
Claude Cowork
Containerization
Cursor
Glean
Go
Google Agentspace
GPU Inference Stacks
Kubernetes
LangChain
LangGraph
Microsoft Copilot Studio
Observability
Python
RAG
SDKs
Telemetry
Vector Databases

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
