SweatPals

Senior Machine Learning Scientist

27 Locations

Remote

Senior level

Fitness • Software • App development

About Sweatpals

Sweatpals is the community-first fitness platform turning workouts into social experiences. Backed by a16z speedrun, Patron, Kevin Hart, Pear VC, and founders of Instacart and DreamWorks Animation, we connect hundreds of thousands of "pals," hosts, and gyms through events, memberships, and social features. We're still scrappy at heart, but scaling fast.

We believe working out should be joyful, social, and inclusive, not just a solo grind. From run clubs and beach pilates to pickleball leagues and cold plunge socials, Sweatpals turns everyday workouts into meaningful social experiences.

Sweatpals also gives local leaders the tools to grow their fitness communities from side hustles to full-time, even million-dollar businesses. Hosts use our platform to run their business, from ticketing and memberships to marketing tools.

We're an AI-forward company. If you're excited about working at the intersection of ML, product, and a real marketplace, you'll fit right in.

The Role

We're looking for a Senior Machine Learning Scientist to join the AI Squad and own the most ambitious ML work on our roadmap. You'll report to the Head of AI and partner closely with engineering and product to ship models that move our marketplace.

This is a high-ownership role. You'll define problems, run the science, ship to production, and measure real user impact. You won't inherit a graveyard of half-finished notebooks. You'll build the next layer of ML at Sweatpals on top of what we've already shipped: semantic search, event tagging, collection ranking, retention models, and our LLM-powered HostCopilot and Front Desk Agent.

You'll spend your time on problems like:

How do we rank events for each Pal so they discover more hosts they'd love?
Can we predict churn early enough for HostCopilot to nudge before it happens?
What's the right way to price a class or membership to maximize host GMV without hurting bookings?
How do we make LLMs reliable enough to draft host campaigns, recommend events, and answer questions at a real front desk?
How do we measure whether our models actually move the marketplace, not just CTR?

What You'll Do

Modeling & Research

Frame fuzzy product problems as ML problems and pick the right approach: ranking, retrieval, classification, sequence models, LLM agents, or classic stats
Run end-to-end: data exploration, offline evaluation, prototype, online experiment, iteration
Push to the cutting edge when it matters, and stay pragmatic when it doesn't
Own offline metrics (NDCG, recall@k, AUC, calibration) and tie them to online metrics (booking lift, retention, GMV)

Production & Shipping

Ship models to production with our engineering team. Our ML stack is FastAPI, PostgreSQL, BigQuery, and AWS App Runner, with retrieval via FAISS and sentence-transformers, and managed LLM APIs (Claude, Gemini)
Build evaluation harnesses and monitoring so we know when models drift
Keep latency budgets honest

LLM & Agent Systems

Develop LLM-powered features across HostCopilot (drip campaigns, retention nudges, pricing and content suggestions) and Pal-facing surfaces (AI Concierge, semantic search, recommendations)
Build agentic systems with tool use, RAG, structured outputs, evaluation loops, and human-in-the-loop where needed
Decide when to prompt-engineer, when to fine-tune, and when a classical model is the better answer

Cross-Functional Impact

Partner with product to size opportunities and translate findings into roadmap decisions
Partner with growth and our data analyst to measure marketplace impact rigorously
Set the bar for the squad on ML rigor: offline evaluation, experiment design, and writeups

What We're Looking For

Experience

5+ years of applied ML experience shipping models to production. Bonus if some of that was in marketplaces, search, or recommendations
Track record of taking a problem from "vague PM ask" to "shipped feature that moved a metric"
Comfort with the full lifecycle: framing, data, modeling, evaluation, deployment, monitoring

Technical Skills

Strong Python and SQL. You write production code, not just notebooks
Solid foundations in at least one ML area: ranking and recommendation systems, NLP and embeddings, classical ML, LLMs and agents, or causal inference
Comfortable with modern LLM tooling: prompting, RAG, evaluation, tool use, structured outputs
Practical stats: experiment design, dealing with confounding, knowing when an A/B test is broken
Familiarity with our stack is a plus: FastAPI, PostgreSQL, BigQuery, FAISS, sentence-transformers, AWS, Amplitude
Advanced degree in ML, CS, stats, or a related field is typical. PhD or research background is a strong bonus

Mindset

Product first. You care about user impact more than novelty
You use AI tools daily. Claude Code, Cursor, whatever ships faster
You write things down. Memos, experiment results, design docs
You're comfortable being the second dedicated ML person at the company and pushing the bar up
You care about quality and follow through. You don't ship and forget

Why Join

Ownership: You'll define the next chapter of ML at Sweatpals, not maintain someone else's models
AI-native culture: We use Claude Code daily, ship fast, and treat AI tooling as table stakes
Flexibility: Remote-first, async-friendly, EU timezone
Compensation: Competitive salary plus early-stage equity

Our Values

Celebrate Diversity of Thought: We embrace different backgrounds, opinions, and ways of thinking. We don't just welcome disagreement; we believe it makes the product better.
Be a Leader: We take initiative, speak up, and drive things forward, no matter our title. Leadership is a mindset, not a level.
Roll Up Our Sleeves: We do what it takes. No job is too small when we're building something big.
Embrace Adventure: We stay curious, push boundaries, and see challenges as opportunities. Startups are a rollercoaster and we're here for the ride.
Make Excellence the Baseline: We hold a high bar for quality and follow through. Doing great work is the starting point, not the finish line.

We're still early, which means your work will shape our path and our impact on communities and businesses everywhere. If you're excited by challenge, autonomy, and building something that matters, you'll feel at home here.

How to Apply

Send your resume (or LinkedIn) and a short note answering:

One ML system you shipped end-to-end and what it actually changed
A modeling decision you made that turned out to be wrong, and what you learned
A paper, blog post, or repo you keep going back to (optional, but we love this)

Skills Required

5+ years of applied ML experience shipping models to production
Track record of taking vague PM asks to shipped features that moved metrics
Strong Python
Strong SQL
Solid foundations in at least one ML area: ranking/recommendations, NLP/embeddings, classical ML, LLMs/agents, or causal inference
Comfortable with modern LLM tooling: prompting, RAG, evaluation, tool use, structured outputs
Practical statistics and experiment design, ability to detect confounding and broken A/B tests
Comfortable with full ML lifecycle: framing, data exploration, modeling, deployment, monitoring
Familiarity with stack: FastAPI, PostgreSQL, BigQuery, FAISS, sentence-transformers, AWS, Amplitude
Experience in marketplaces, search, or recommendations
Advanced degree in ML, CS, statistics, or related (MS typical, PhD or research background is a strong bonus)