Kubernetes Architect

Reposted 24 Days Ago
Hiring Remotely in USA
Remote
Senior level
Artificial Intelligence • Cloud • Software
The Role
Lead Kubernetes infrastructure design and deployment for AI applications, ensuring scalability and security while mentoring teams in best practices.

At TensorWave, we’re leading the charge in AI compute, building a versatile cloud platform that’s driving the next generation of AI innovation. We’re focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what’s possible in the AI landscape.

About the Role:

We are seeking an exceptional Kubernetes Architect to lead the design, development, and deployment of our next-generation infrastructure platform. This is a very senior role for someone who not only understands Kubernetes deeply but can also write complex manifests, operators, and controllers from scratch and architect resilient, secure, and performant systems that scale to millions of users.

As a technical visionary and hands-on expert, you will lead the evolution of our cloud-native architecture, including designing serverless systems on Kubernetes, integrating with CI/CD, and ensuring observability, security, and cost-efficiency across environments.

Responsibilities:
  • Architect and implement end-to-end Kubernetes infrastructure for large-scale, cloud-native applications.

  • Design and build serverless platforms on top of Kubernetes using technologies such as Knative, OpenFaaS, or KEDA.

  • Develop and maintain Kubernetes custom resource definitions (CRDs), controllers, operators, and admission controllers in Go or Python.

  • Define multi-tenant, multi-region architecture supporting millions of users with high availability and low latency.

  • Lead Kubernetes cluster lifecycle management (provisioning, upgrades, scaling, monitoring, troubleshooting).

  • Collaborate closely with engineering teams to containerize applications, write Helm charts or Kustomize overlays, and standardize deployment practices.

  • Implement infrastructure as code using tools like Terraform, Pulumi, or Crossplane.

  • Lead efforts around observability, policy enforcement, cost optimization, and RBAC/security hardening within the cluster.

  • Evaluate and integrate Kubernetes ecosystem tools (e.g., Istio/Linkerd, ArgoCD, Flux, Prometheus, Grafana, OPA).

  • Mentor and upskill DevOps engineers and SREs in Kubernetes best practices.

Essential Skills & Qualifications:
  • 8+ years of experience in cloud infrastructure, DevOps, or platform engineering roles.

  • 4+ years of hands-on Kubernetes experience, including deep knowledge of the Kubernetes API, internals, networking, and storage.

  • Proficiency in writing Kubernetes manifests, Helm charts, and custom Kubernetes controllers/operators (preferably in Go).

  • Proven experience designing cloud-native systems that scale globally (multi-region, multi-cloud or hybrid setups).

  • Experience with serverless technologies (Knative, OpenFaaS, AWS Lambda, etc.) in a production environment.

  • Strong knowledge of cloud platforms such as AWS, GCP, or Azure.

  • Experience with GitOps tools (ArgoCD, Flux), service meshes, policy engines (OPA/Gatekeeper), and CI/CD pipelines.

  • Deep understanding of security, compliance, and resilience in containerized workloads.

Additional/Preferred Qualifications:
  • Contributions to Kubernetes open-source projects or CNCF-related tooling.

  • Experience with service mesh design (Istio, Linkerd).

  • Familiarity with eBPF, Cilium, or network-level observability.

  • Background in building PaaS or developer platforms on top of Kubernetes.

What Success Looks Like:
  • A production-grade Kubernetes platform that can support millions of users globally, with self-healing, autoscaling, and strong observability.

  • Developer teams can deploy serverless applications with ease, speed, and reliability.

  • Infrastructure is resilient, secure, cost-optimized, and compliant.

  • Kubernetes practices and tooling are well-documented, standardized, and continuously improved across the company.

What We Bring:
  • Stock Options

  • 100% paid Medical, Dental, and Vision insurance

  • Life and Voluntary Supplemental Insurance

  • Short Term Disability Insurance

  • Flexible Spending Account

  • 401(k)

  • Flexible PTO

  • Paid Holidays

  • Parental Leave

  • Mental Health Benefits through Spring Health

Top Skills

ArgoCD
Crossplane
Flux
Go
Grafana
Helm
Istio
KEDA
Knative
Kubernetes
Linkerd
OPA
OpenFaaS
Prometheus
Pulumi
Python
Terraform
The Company
HQ: Las Vegas, Nevada
56 Employees

What We Do

TensorWave is a cutting-edge cloud platform designed specifically for AI workloads. Offering AMD MI300X accelerators and a best-in-class inference engine, TensorWave is a top choice for training, fine-tuning, and inference. Visit tensorwave.com to learn more.
Send us a message to try it for free.
