Principal Software Engineer, AIOps

2 Locations
In-Office
Expert/Leader
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Role
Lead the design and architecture of a high-performance, scalable observability and prediction platform for AI factories, collaborating with teams on technical standards and mentoring staff.

NVIDIA is powering the world’s most advanced AI Factories. To ensure their seamless operation, we are building a mission-critical Observability and Prediction platform. The platform follows a dual-delivery model: it ships both as a high-scale SaaS solution and as a robust on-premises deployment for our largest enterprise customers.

We are looking for a Principal Engineer to lead the architectural vision of the platform’s core. In this role, you will be the internal technical authority responsible for building a unified, high-performance engine that processes massive telemetry streams and runs advanced predictive models, regardless of where the infrastructure resides.

 

What you’ll be doing:

  • Unified Architectural Vision: Lead the design of a flexible, high-scale architecture that supports both multi-tenant SaaS environments and complex on-premises deployments.
  • Operationalizing Predictive Models: Bridge the gap between AI research and production by architecting the framework that runs sophisticated predictive algorithms at scale, ensuring they are robust enough for mission-critical environments.
  • High-Scale Engineering: Design distributed systems to handle the extreme telemetry density of large-scale AI clusters, ensuring efficient data ingestion, processing, and real-time analysis.
  • Cross-Organizational Leadership: Collaborate with networking and infrastructure teams to define the technical standards that enable the AIOps platform to integrate seamlessly with global AI infrastructure.
  • Technical Excellence: Drive the engineering roadmap, mentor senior staff, and serve as the final authority on architectural decisions, ensuring the platform meets the highest standards of reliability and scalability.

What we need to see:

  • Education: B.Sc./M.Sc. in Computer Science, Computer Engineering, or a related technical field.
  • Experience: 12+ years of experience in software engineering, with a proven track record of architecting complex, high-scale products delivered via SaaS and/or on-premises enterprise models.
  • Architectural Sovereignty: Deep expertise in building environment-agnostic distributed systems, using technologies like Kubernetes to ensure portability across cloud and private data centers.
  • Core Systems Programming: Expert-level proficiency in languages such as Go, C++, or Rust, with a focus on high-performance, concurrent architectures.
  • Data Infrastructure: Extensive experience with high-throughput data processing (e.g., Apache Kafka) and managing large-scale telemetry or time-series data.

Ways to stand out from the crowd:

  • The "0 to 1" Mindset: A proven track record of taking a complex architectural concept from a whiteboard to a stabilized, production-grade platform.
  • A "Systems" Thinker: You don't just write software; you understand the full stack, from how data moves across the wire to how it’s processed in a distributed cluster.
  • Infrastructure Evangelist: Experience leading large-scale technical migrations or introducing modern engineering paradigms (like Cloud-Native or GitOps) into complex, high-stakes environments.
  • Practical Innovation: The ability to simplify complex problems and build internal tools or frameworks that empower other engineering teams to move faster.

#LI-Hybrid

Top Skills

Apache Kafka
C++
Go
Kubernetes
Rust

The Company
HQ: Santa Clara, CA
21,960 Employees
Year Founded: 1993

What We Do

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, NVIDIA is increasingly known as “the AI computing company.”
