Software Engineer, Distributed Systems

Reposted 8 Days Ago
San Francisco, CA, USA
In-Office
180K-250K Annually
Expert/Leader
Cloud • Digital Media • Information Technology
Generative media platform for developers.
The Role
As a Staff Software Engineer, you will develop and maintain a core Python platform for managing computation workloads and cloud infrastructure, while ensuring system reliability and scalability.
Summary Generated by Built In

fal is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at scale without compromise. For developers and enterprises, fal is the foundation that makes generative media not just possible, but practical: a unified platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products.

As generative media reshapes industries across a market projected to grow by hundreds of billions over the next decade, fal is becoming the ecosystem that ambitious teams build on.

About this role: 

You are an experienced software engineer who thrives on building large-scale computing platforms. You have deep expertise in large-scale distributed systems that handle high complexity and heavy traffic and data volumes. You know how to achieve reliability and scale with minimal operational load.

Key responsibilities
  • Build our core Python/Rust platform: request routing, AI workload orchestration, scheduling, GPU autoscaling, large-scale file storage, queueing, and more
  • Produce forward-looking designs for platform evolution as we scale to 100x current traffic and need to provide low latency across the world
  • Use AI extensively to automate the mundane parts of building complex but reliable systems
  • Profile and tune low-level CPU and memory performance
Requirements
  • 3+ years of experience building distributed compute and orchestration platforms in Python or Rust
  • Strong understanding of distributed systems fundamentals: consensus, scheduling, fault tolerance, capacity planning
  • Deep understanding of computational complexity and memory allocation
  • Track record of designing systems that scale under real production load
  • Experience building and using observability to drive performance and reliability decisions
  • Excellent communication and ability to drive technical decisions across teams
  • Self-starter who executes quickly, takes ownership, and constantly seeks improvement
Nice to have
  • Experience with AI/ML inference or training infrastructure
  • Experience with high-performance systems programming (async runtimes, zero-copy, memory-safe concurrency)
  • Background in building multi-tenant compute platforms
  • Understanding of networking fundamentals and performance characteristics
  • Familiarity with GPU workload characteristics and scheduling constraints
Compensation
  • $180,000-$250,000 plus equity and benefits (this range spans all three levels: Mid, Senior, and Staff)
Location
  • San Francisco, CA (willing to consider remote for Senior and Staff levels)

What we offer at fal
  • Interesting and challenging work
  • A lot of learning and growth opportunities
  • We are currently hiring in downtown San Francisco.
  • We offer relocation assistance to San Francisco.
  • Health, dental, and vision insurance (US)
  • Regular team events and offsites

Skills Required

  • Deep experience building distributed compute platforms, preferably with Python
  • Strong foundation in managing both cloud and bare metal infrastructure
  • Solid understanding of Kubernetes (K8s) and running CI/CD on it
  • Excellent communication
  • Self-starter who executes quickly, takes ownership and constantly seeks improvement

The Company
73 Employees

What We Do

Generative Media Cloud
