Principal Kubernetes Platform Engineer

Reposted 13 Days Ago
Hiring Remotely in USA
Remote
Senior level
Artificial Intelligence • Cloud • Software
The Role
Lead Kubernetes infrastructure design and deployment for AI applications, ensuring scalability and security while mentoring teams in best practices.

Our mission at TensorWave Cloud is to build seamless, secure, reliable, and resilient AI infrastructure at scale, eliminating barriers and challenging the status quo to empower builders and support AI innovation.

About the role

We are seeking a Principal Platform Engineer to lead the design, development, and deployment of our next-generation Kubernetes platform.

In this role, you will define what production excellence looks like at scale: a global, self-healing, autoscaling Kubernetes platform with strong observability, security, and cost efficiency, capable of supporting millions of users.

As a technical leader and hands-on architect, you will build and evolve cloud-native and serverless systems on Kubernetes, writing complex manifests, operators, and controllers from scratch.

You will set standards and best practices across the company, ensuring platform tooling is well-documented, reliable, and continuously improved, while enabling developer teams to deploy applications with speed, confidence, and minimal friction.

Responsibilities

  • Architect and implement end-to-end Kubernetes infrastructure for large-scale, cloud-native applications

  • Design and build serverless platforms on top of Kubernetes using technologies such as Knative, OpenFaaS, or KEDA

  • Develop and maintain Kubernetes custom resource definitions (CRDs), controllers, operators, and admission controllers in Go or Python

  • Define multi-tenant, multi-region architecture supporting millions of users with high availability and low latency

  • Lead Kubernetes cluster lifecycle management: provisioning, upgrades, scaling, monitoring, and troubleshooting

  • Collaborate closely with engineering teams to containerize applications, write Helm charts or Kustomize overlays, and standardize deployment practices

  • Implement infrastructure as code using tools like Terraform, Pulumi, or Crossplane

  • Lead efforts around observability, policy enforcement, cost optimization, and RBAC/security hardening within the cluster

  • Evaluate and integrate Kubernetes ecosystem tools such as Istio/Linkerd, ArgoCD, Flux, Prometheus, Grafana, and OPA

  • Mentor and upskill DevOps engineers and SREs in Kubernetes best practices

Required Experience

  • Bachelor of Science in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience

  • 8+ years of experience in cloud infrastructure, DevOps, or platform engineering roles

  • 8+ years of hands-on Kubernetes experience, including deep knowledge of the Kubernetes API, internals, networking, and storage

  • Proficiency in writing Kubernetes manifests, Helm charts, and custom Kubernetes controllers/operators

  • Proven experience designing cloud-native systems that scale globally across multi-region, multi-cloud, or hybrid setups

  • Experience with serverless technologies in production, such as Knative, OpenFaaS, or AWS Lambda

  • Strong knowledge of cloud platforms such as AWS, GCP, or Azure

  • Experience with GitOps tools such as ArgoCD or Flux

  • Deep understanding of security, compliance, and resilience in containerized workloads

Preferred Experience

  • Contributions to Kubernetes open-source projects or CNCF-related tooling

  • Experience with service mesh design (Istio, Linkerd)

  • Familiarity with eBPF, Cilium, or network-level observability

  • Background in building PaaS or developer platforms on top of Kubernetes

What We Bring

  • Mission-driven company

  • Competitive Salary

  • Stock Options

  • 100% paid Medical, Dental, and Vision insurance

  • Flexible PTO

  • Paid Holidays

  • 401(k)

  • Parental Leave

  • Flexible Spending Account

  • Short Term Disability Insurance

  • Life and Voluntary Supplemental Insurance

  • Mental Health Benefits through Spring Health

We’re looking for resilient, adaptable people to join our team, people who believe in the mission and think at massive scale. The solutions that worked on a handful of devices will not work at Exascale. Be prepared to be pushed daily, to learn a lot, and literally build the future.

TensorWave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, national origin, or veteran status.

Top Skills

ArgoCD
Crossplane
Flux
Go
Grafana
Helm
Istio
KEDA
Knative
Kubernetes
Linkerd
OPA
OpenFaaS
Prometheus
Pulumi
Python
Terraform

The Company
HQ: Las Vegas, Nevada
56 Employees

What We Do

TensorWave is a cutting-edge cloud platform designed specifically for AI workloads. Offering AMD MI300X accelerators and a best-in-class inference engine, TensorWave is a top choice for training, fine-tuning, and inference. Visit tensorwave.com to learn more.
Send us a message to try it for free.
