RESPONSIBILITIES:
- Design, build, and operate production AI systems and scaffolding around language models that power conversational, predictive, and generative capabilities across WHOOP products.
- Lead end-to-end AI system initiatives spanning problem definition, data flows, dataset design, evaluation harnesses, deployment, and iteration in close partnership with data science and product.
- Build and maintain pipelines for collecting, curating, and reshaping messy, multi-source data into high-quality, well-structured training and evaluation datasets for language model–based systems.
- Operationalize fine-tuning and evaluation workflows for large language models behind member-facing features such as WHOOP Coach and AI Support, including defining datasets, labels, and taxonomies that reflect real member needs.
- Develop tooling and frameworks that make experimentation, offline/online evaluation, and model deployment faster, safer, and more repeatable, including robust observability for AI features in production.
- Build and maintain feedback loops that connect real member interactions, offline evaluations, and training data updates so that models improve continuously based on real-world behavior.
- Mentor other engineers and data scientists, share best practices in applied AI/ML, and help elevate the overall technical bar of the AI Platform team.
QUALIFICATIONS:
- 3+ years of experience in applied machine learning, AI engineering, or ML-focused software engineering roles, including significant work in production environments.
- Hands-on experience building with modern language models (open-weight or API-based), including prompt design, fine-tuning, and rigorous evaluation.
- Solid working understanding of ML fundamentals (dataset construction, feature engineering, training workflows, evaluation metrics, experiment design) sufficient to make good engineering tradeoffs and partner effectively with data scientists.
- Familiarity with modern LLM training and alignment techniques such as supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL), and how they influence data requirements, evaluation strategies, and system design in production.
- Proven track record building, shipping, and operating ML-powered systems end to end: from data pipelines (batch and/or streaming) that transform large datasets into usable training and evaluation sets, through production deployments with inference optimization, observability, and lifecycle management.
- Strong proficiency in data manipulation and analysis, including working with messy, multi-source, and semi-structured data and translating product questions into well-defined datasets, labels, and evaluation splits.
- Familiarity with best practices for secure, privacy-aware AI and working with sensitive data.
- Excellent communication and collaboration skills, with the ability to influence across teams and drive alignment on technical direction.
What We Do
At WHOOP, we’re on a mission to unlock human performance. WHOOP empowers members to perform at a higher level through a deeper understanding of their bodies and daily lives. Our wearable device and performance optimization platform has been adopted by many of the world's greatest athletes and by consumers alike.
Why Work With Us
At WHOOP, we’re focused on building an inclusive and equitable team with a strong sense of belonging for everyone—increasing representation in every way as our team grows. We believe that our differences are our source of strength—so much so it’s one of our core values.
WHOOP Offices
Hybrid Workspace
Employees engage in a combination of remote and on-site work.





