About Us
At Union, we are solving one of the hardest challenges in AI infrastructure today: enabling high-velocity iteration while maintaining seamless production-readiness for AI workloads at scale. Flyte, the open-source project we steward, is the emerging standard for modern data and AI orchestration, with numerous leading technology organizations - like LinkedIn, Spotify, and Gojek - running millions of mission-critical workloads on the platform. We have a deep bench of infrastructure veterans from companies in the Big Three and beyond and a technical founding team who originally created Flyte while at Lyft.
The Opportunity
We are seeking a highly technical, versatile Distributed Systems Engineer, reporting to the Head of Engineering, with 10+ years of professional experience designing, building, and implementing services and solutions that streamline the delivery, installation, and orchestration of data and services in a large-scale AI/ML platform based on the Flyte orchestration framework. Successful candidates will have a broad understanding of multiple cloud vendors, Kubernetes, API design, and high-volume, low-latency systems, and will thrive in a fast-paced environment. We value individuals who enjoy tackling challenges head-on, can communicate effectively across technical and non-technical teams, have a knack for creative problem-solving, and can balance short-term priorities with long-term goals, both as a hands-on technical partner across the organization and as a leader building a high-performing engineering team.
In this role, you will:
Design and build distributed systems backend services (APIs, Kubernetes controllers, etc.) and client components to install, manage, and observe Union services in a Kubernetes-native environment.
Lead, mentor, and foster the professional growth of a high-performing, collaborative engineering team through effective coaching and guidance.
Design, implement, and optimize distribution strategies to facilitate simple and intuitive management of a complex platform in customer-controlled environments.
Work across multiple cloud vendors including AWS, GCP, Azure, and OCI as well as neo-cloud providers.
Develop and maintain services and tooling to make our systems more reliable, secure, and performant.
Contribute to architectural decisions and participate in code and design reviews across various teams, ensuring the highest standards of quality and performance.
Work closely with broader teams including Backend, Frontend, and Support to improve the experience for our customers.
Frontend expertise is a bonus.
You will be expected to be in-office.
About you:
You have 10+ years of experience in deeply technical roles in engineering functions.
You have 3-4 years of professional experience leading, managing, growing, and coaching a team of engineers.
You have a deep passion for all things Kubernetes and the broader container orchestration ecosystem.
You can navigate and pick up new technologies quickly.
You always think about the big picture and can put yourself in the shoes of the developer and the customer.
You have hands-on experience with backend programming languages (Go, Rust, Python).
You can own complex projects from planning to completion.
Bonus: You have a general understanding of building modern web applications using Next.js, React, and TypeScript.
You can expect to work with the following tools at Union; however, we’re constantly evolving our stack!
Languages: Golang, Rust, Python
Infrastructure: AWS, GCP, Azure, OCI, Kubernetes
CI/CD: Buildkite, ArgoCD, Terraform, Helm
Benefits & Belonging
At Union.ai, we know that employees who feel their best can build amazing things, and we are proud to offer best-in-class benefits that will continually evolve and grow as the needs of our employees do. Benefits may vary based on country.
Excellent medical - We pay 100% of your premiums and 90% for your dependents
Generous dental and vision plans - We pay 90% of the premiums for you and your dependents
Meaningful equity in the form of options - all employees are owners here
Unlimited time off + 12 company holidays
401(k) match - Union.ai matches 100% of contributions up to the first 3%, and 50% up to 5%
16 weeks paid parental leave for primary and secondary caregivers
Flexible work schedule (some restrictions apply)
For in-office employees: Lunch provided onsite and a well-stocked kitchen with snacks and drinks.
We believe that our differences are what bring us together to achieve truly special outcomes. We strive to be inclusive and focus on building teams that embody that quality too. Union.ai is an equal-opportunity employer and we encourage you to apply, even if your experience doesn’t align exactly with our job description.
What We Do
Join a community, not just a team
Union.ai is the company behind Flyte, the leading machine learning orchestration ecosystem based on Kubernetes. Flyte is deployed in production now at Lyft, Spotify, Gojek and dozens of other companies across various industries.
Why Work With Us
When you join us, you join the community! At Union.ai we care deeply about our open source developer community and recognize the importance of nurturing it as we continue to scale. We want our employees to feel empowered to continually put our customers first as we set out to build scalable infrastructure to support our users.
