(Senior) Software Engineer, Infrastructure (Kubernetes Platform)

Fremont, CA
In-Office
120K-240K Annually
Mid level
Artificial Intelligence • Software • Transportation
The Role
Design, operate, and optimize Kubernetes clusters across hybrid cloud environments. Contribute to platform features, lifecycle management, security best practices, and collaborate with cross-functional teams to enhance automation and reliability.

Founded in 2016 in Silicon Valley, Pony.ai has quickly become a global leader in autonomous mobility and a pioneer in extending autonomous mobility technologies and services across a rapidly expanding footprint of sites around the world. Operating Robotaxi, Robotruck and Personally Owned Vehicles (POV) business units, Pony.ai is an industry leader in the commercialization of autonomous driving and is committed to developing the safest autonomous driving capabilities on a global scale. Pony.ai’s leading position has been recognized, with CNBC ranking Pony.ai #10 on its Disruptor list of the 50 most innovative and disruptive tech companies of 2022. In June 2023, Pony.ai was recognized on the XPRIZE and Bessemer Venture Partners inaugural “XB100” 2023 list of the world’s top 100 private deep tech companies, ranking #12 globally. As of August 2023, Pony.ai has accumulated nearly 21 million miles of autonomous driving globally. Pony.ai went public on NASDAQ in November 2024.

Responsibilities

As a (Senior) Kubernetes Engineer, you will:

  • Design, operate, and optimize Kubernetes clusters across hybrid cloud environments (public cloud and on-prem datacenter).
  • Support diverse workloads including large-scale model training and low-latency inference services.
  • Develop, maintain, and extend Kubernetes platform features (operators, CRDs, APIs) to automate and productize internal use cases.
  • Own cluster lifecycle management including upgrades, patching, configuration, and governance.
  • Define and enforce best practices for service deployments, security policies, and operational guidelines.
  • Contribute to observability and SRE practices to ensure reliability at scale (SLOs, incident reviews, metrics-driven improvements).
  • Collaborate with storage, compute, and networking teams (CNI, ingress, service discovery) to enhance automation, availability, and performance.
  • Provide technical mentorship, documentation, and on-call support for cluster-related incidents.

Requirements
  • Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience.
  • 3+ years of hands-on experience managing Kubernetes clusters in production (EKS/GKE/AKS and/or bare-metal).
  • Strong Linux systems background and distributed systems fundamentals (scheduling, reliability, scaling).
  • Proven experience with hybrid cloud environments (AWS, GCP, Azure, and on-prem).
  • Expertise in containerization (Docker) and Infrastructure-as-Code tools (Terraform, Helm, Ansible, or similar).
  • Experience developing and maintaining Kubernetes platform features (operators, CRDs, APIs).
  • Solid knowledge of Kubernetes networking (CNI, ingress, service discovery), storage, and compute integrations.
  • Strong understanding of security best practices (RBAC, network policies, secrets).
  • Effective communication skills and ability to work cross-functionally in a fast-paced environment.
Preferred Experience
  • Programming skills in Go and/or Python for operator development, platform automation, and tooling.
  • Experience with observability and SRE practices (Prometheus, Grafana, ELK, Datadog; SLOs, incident response, postmortems).
  • Familiarity with workloads common to AI/ML systems (training, inference).
Compensation and Benefits

Base Salary Range: $120,000 - $240,000 Annually

Compensation may fall outside this range depending on many factors, including the candidate’s qualifications, skills, competencies, experience, and location. Base pay is one part of total compensation, and this role may be eligible for bonuses/incentives and restricted stock units.

We also provide the following benefits to eligible employees:

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (Traditional and Roth 401k)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off (Vacation & Public Holidays)
  • Family Leave (Maternity, Paternity)
  • Short Term & Long Term Disability
  • Free Food & Snacks


Top Skills

Ansible
AWS
Azure
Docker
GCP
Helm
Kubernetes
Linux
Terraform

The Company
HQ: Fremont, CA
512 Employees
Year Founded: 2016

What We Do

Pony AI Inc. (“Pony.ai”) is a global leader in the large-scale commercialization of autonomous mobility.
Leveraging its vehicle-agnostic Virtual Driver technology, a full-stack autonomous driving technology that seamlessly integrates its proprietary software, hardware, and services, Pony.ai is developing a commercially viable and sustainable business model that enables the mass production and deployment of vehicles across transportation use cases.
Founded in 2016, Pony.ai has expanded its presence across China, Europe, East Asia, the Middle East, and other regions, ensuring widespread accessibility to its advanced technology.
Pony.ai is among the first in China to obtain licenses to operate fully driverless vehicles in all four Tier-1 cities in China (Beijing, Guangzhou, Shanghai, Shenzhen) and has begun to offer public-facing, fare-charging robotaxi services without safety drivers in Beijing, Guangzhou and Shenzhen. Pony.ai operates a fleet consisting of over 250 robotaxis.
To date, Pony.ai has driven nearly 45 million autonomous testing and operation kilometers on open roads worldwide.
