Our mission at Tensorwave Cloud is to build seamless, secure, reliable, and resilient AI infrastructure at scale, eliminating barriers and challenging the status quo to empower builders and support AI innovation.
About the role
We are seeking a Senior Kubernetes Platform Engineer focused on support and operations.
You’ll play a critical role in maintaining the stability and reliability of our bare-metal Kubernetes infrastructure. Working closely with senior engineers, you’ll take point on troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads.
This is a great opportunity for engineers ready to deepen their Kubernetes expertise while supporting cutting-edge AI environments in real time.
Responsibilities
Own and troubleshoot operational issues within Kubernetes environments
Maintain and monitor core services (e.g., Cilium, HAProxy, Prometheus)
Ensure uptime, performance, and reliability of multi-tenant clusters
Assist with ingress/egress connectivity and network debugging
Support internal and customer teams in secure, isolated VPC environments
Collaborate with senior engineers on automation and cluster lifecycle improvements
Required Experience
Bachelor of Science in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
5+ years of experience in DevOps, SRE, or Linux infrastructure roles
4+ years of hands-on experience with Kubernetes in production
Familiarity with networking, CNI plugins, and core Linux troubleshooting
Strong infrastructure-as-code mindset (Helm, Terraform, Ansible)
Solid experience with monitoring and logging tools (Prometheus, Grafana, Loki)
Understanding of secure infrastructure design principles and least-privilege access
Preferred Experience
Experience with RKE2, Rancher, or similar platforms
Experience troubleshooting or supporting AI or GPU-based workloads
Familiarity with HAProxy, Cilium, or other Kubernetes ingress/networking tools
What We Bring
Mission-driven company
Competitive Salary
Stock Options
100% paid Medical, Dental, and Vision insurance
Flexible PTO
Paid Holidays
401(k)
Parental Leave
Flexible Spending Account
Short-Term Disability Insurance
Life and Voluntary Supplemental Insurance
Mental Health Benefits through Spring Health
We’re looking for resilient, adaptable people to join our team: people who believe in the mission and think at massive scale. The solutions that worked on a handful of devices will not work at exascale. Be prepared to be pushed daily, to learn a lot, and to build the future.
Tensorwave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, national origin, or veteran status.
What We Do
TensorWave is a cutting-edge cloud platform designed specifically for AI workloads. Offering AMD MI300X accelerators and a best-in-class inference engine, TensorWave is a top choice for training, fine-tuning, and inference. Visit tensorwave.com to learn more.
Send us a message to try it for free.