Modal provides the infrastructure foundation for AI teams. With instant GPU access, sub-second container startups, and native storage, Modal makes it simple to train models, run batch jobs, and serve low-latency inference. Companies like Suno, Lovable, and Substack rely on Modal to move from prototype to production without the burden of managing infrastructure.
We're a fast-growing team based out of NYC, SF, and Stockholm. We've hit high 8-figure ARR and recently raised a Series B at a $1.1B valuation. We have thousands of customers who rely on us for production AI workloads, including Lovable, Scale AI, Substack, and Suno.
Working at Modal means joining one of the fastest-growing AI infrastructure organizations at an early stage, with many opportunities to grow within the company. Our team includes creators of popular open-source projects (e.g. Seaborn, Luigi), academic researchers, international olympiad medalists, and engineering and product leaders with decades of experience.
The Role:
We're looking for strong engineering managers who love leading and mentoring high-performing full-stack engineering teams and have a high degree of customer empathy, product sense and ownership.
Requirements:
1+ years of engineering management experience.
4+ years of full-time software engineering experience.
Experience building applications with a modern front-end JavaScript framework such as React. Prior experience with Svelte is nice to have, but not required.
Experience architecting and scaling modern web infrastructure.
Strong product sense and experience driving product outcomes.
Strong communication skills and a desire to partner with our customers in solving their problems.
Ability to partner closely with product design to craft delightful user experiences.
Ability to work in-person in our NYC office.
What We Do
Deploy generative AI models, large-scale batch jobs, job queues, and more on Modal's platform. We help data science and machine learning teams accelerate development, reduce costs, and effortlessly scale workloads across thousands of CPUs and GPUs.
Our pay-per-use model ensures you're billed only for actual compute time, down to the CPU cycle. No more wasted resources or idle costs—just efficient, scalable computing power when you need it.