The Role
Design and build core backend AI systems and components for a multi-cloud execution platform. Enhance support for AI and batch workloads, collaborate with users and the open-source community, and contribute public-facing content.
SkyPilot is building the future of multicloud AI infra. We are the Berkeley founding team commercializing SkyPilot (9.5K+ GitHub stars, 200+ contributors) to enable AI to run on different cloud infrastructures in a portable, cost-optimizing, and highly available way.
SkyPilot is deployed at hundreds of companies, including Fortune 500s and top AI-natives (Shopify, Redis, Abridge, Hippocratic, Applied Compute, etc.). In 2025, adoption grew by more than 600%, and SkyPilot now launches more GPUs per month than the biggest neocloud’s fleet. SkyPilot, currently in stealth, was founded in 2024 by UC Berkeley PhDs and professors (incl. Databricks cofounders). We’re building a top-tier engineering team, with current talent from Databricks, Google, Crusoe, ByteDance, and PingCAP.
You will play a crucial role in shaping the future of Sky Computing and AI infrastructure:
- Design and build core AI systems that realize the Sky Computing vision and make SkyPilot the standard solution for multi-cloud, any-cloud execution for AI.
- Build enhancements and new components to evolve SkyPilot with better support for a wide range of AI and batch workloads.
- Engage with users: work closely with our users and customers to make their use cases successful, grow our open-source community, and gain visibility for your work via public tutorials, blog posts, and/or talks.
Ideal Candidates
- Strong systems background: 3+ years of industry experience in backend engineering (YOE can be relaxed for exceptional candidates). Bonus: Designed and/or implemented impactful infra platforms & cloud/distributed systems.
- Experience with cloud infra technologies: e.g., gRPC, Protobuf, AWS EC2 / GCP GCE / Azure, object storage, cloud networking, Kubernetes, Terraform, load balancers.
- Experience with Python/Go, or other systems programming languages.
- Bonus: Familiarity with GenAI / DL / ML workloads or related infra frameworks (e.g., Kueue, KAI, KServe).
- Passion for building the future of AI infra and cloud computing.
What We Offer
- Competitive equity and compensation.
- Chance to work with some of the best minds in cloud, distributed, and AI systems.
- Front-row seat at the latest open-source infra startup from Berkeley (prev: Databricks, Anyscale).
Skills Required
- 3+ years industry experience in backend engineering
- Experience with cloud infrastructure technologies (gRPC, Protobuf, AWS EC2, GCP GCE, Azure, object storage, cloud networking, Kubernetes, Terraform, load balancers)
- Experience with Python, Go, or other systems programming languages
- Designed and/or implemented infra platforms and cloud/distributed systems
- Familiarity with GenAI/DL/ML workloads or infra frameworks (Kueue, KAI, KServe)
The Company
What We Do
SkyPilot is an open-source framework designed to run, manage, and scale AI, machine learning, and data science workloads on any AI infrastructure. It provides teams with a unified interface to launch jobs across various clouds and regions, automating compute selection and cost optimization to reduce expenses and simplify the management of complex cloud resources without requiring deep infrastructure expertise.
