Senior Machine Learning Scientist

27 Locations
Remote
Senior level
Fitness • Software • App development
About Sweatpals

Sweatpals is the community-first fitness platform turning workouts into social experiences. Backed by a16z speedrun, Patron, Kevin Hart, Pear VC, and founders of Instacart and DreamWorks Animation, we connect hundreds of thousands of "pals," hosts, and gyms through events, memberships, and social features. We're still scrappy at heart, but scaling fast.

We believe working out should be joyful, social, and inclusive, not just a solo grind. From run clubs and beach pilates to pickleball leagues and cold plunge socials, Sweatpals turns everyday workouts into meaningful social experiences.

Sweatpals also gives local leaders the tools to grow their fitness communities from side hustles to full-time, even million-dollar businesses. Hosts use our platform to run their business, from ticketing and memberships to marketing tools.

We're an AI-forward company. If you're excited about working at the intersection of ML, product, and a real marketplace, you'll fit right in.

The Role

We're looking for a Senior Machine Learning Scientist to join the AI Squad and own the most ambitious ML work on our roadmap. You'll report to the Head of AI and partner closely with engineering and product to ship models that move our marketplace.

This is a high-ownership role. You'll define problems, run the science, ship to production, and measure real user impact. You won't inherit a graveyard of half-finished notebooks. You'll build the next layer of ML at Sweatpals on top of what we've already shipped: semantic search, event tagging, collection ranking, retention models, and our LLM-powered HostCopilot and Front Desk Agent.

You'll spend your time on problems like:

  • How do we rank events for each Pal so they discover more hosts they'd love?

  • Can we predict churn early enough for HostCopilot to nudge before it happens?

  • What's the right way to price a class or membership to maximize host GMV without hurting bookings?

  • How do we make LLMs reliable enough to draft host campaigns, recommend events, and answer questions at a real front desk?

  • How do we measure whether our models actually move the marketplace, not just CTR?

What You'll Do

Modeling & Research
  • Frame fuzzy product problems as ML problems and pick the right approach: ranking, retrieval, classification, sequence models, LLM agents, or classic stats

  • Run end-to-end: data exploration, offline evaluation, prototype, online experiment, iteration

  • Push to the cutting edge when it matters, stay pragmatic when it doesn't

  • Own offline metrics (NDCG, recall@k, AUC, calibration) and tie them to online metrics (booking lift, retention, GMV)

Production & Shipping
  • Ship models to production with our engineering team. Our ML stack is FastAPI, PostgreSQL, BigQuery, and AWS App Runner, with retrieval via FAISS and sentence-transformers, and managed LLM APIs (Claude, Gemini)

  • Build evaluation harnesses and monitoring so we know when models drift

  • Keep latency budgets honest

LLM & Agent Systems
  • Develop LLM-powered features across HostCopilot (drip campaigns, retention nudges, pricing and content suggestions) and Pal-facing surfaces (AI Concierge, semantic search, recommendations)

  • Build agentic systems with tool use, RAG, structured outputs, evaluation loops, and human-in-the-loop where needed

  • Decide when to prompt-engineer, when to fine-tune, and when a classical model is the better answer

Cross-Functional Impact
  • Partner with product to size opportunities and translate findings into roadmap decisions

  • Partner with growth and our data analyst to measure marketplace impact rigorously

  • Set the bar for the squad on ML rigor: offline evaluation, experiment design, and writeups

What We're Looking For

Experience
  • 5+ years of applied ML experience shipping models to production. Bonus if some of that was in marketplaces, search, or recommendations

  • Track record of taking a problem from "vague PM ask" to "shipped feature that moved a metric"

  • Comfort with the full lifecycle: framing, data, modeling, evaluation, deployment, monitoring

Technical Skills
  • Strong Python and SQL. You write production code, not just notebooks

  • Solid foundations in at least one ML area: ranking and recommendation systems, NLP and embeddings, classical ML, LLMs and agents, or causal inference

  • Comfortable with modern LLM tooling: prompting, RAG, evaluation, tool use, structured outputs

  • Practical stats: experiment design, dealing with confounding, knowing when an A/B test is broken

  • Familiarity with our stack is a plus: FastAPI, PostgreSQL, BigQuery, FAISS, sentence-transformers, AWS, Amplitude

  • Advanced degree in ML, CS, stats, or a related field is typical. PhD or research background is a strong bonus

Mindset
  • Product first. You care about user impact more than novelty

  • You use AI tools daily. Claude Code, Cursor, whatever ships faster

  • You write things down. Memos, experiment results, design docs

  • You're comfortable being the second dedicated ML person at the company and pushing the bar up

  • You care about quality and follow through. You don't ship and forget

Why Join
  • Ownership: You'll define the next chapter of ML at Sweatpals, not maintain someone else's models

  • AI-native culture: We use Claude Code daily, ship fast, and treat AI tooling as table stakes

  • Flexibility: Remote-first, async-friendly, EU timezone

  • Compensation: Competitive salary plus early-stage equity

Our Values
  • Celebrate Diversity of Thought: We embrace different backgrounds, opinions, and ways of thinking. We don't just welcome disagreement, we believe it makes the product better.

  • Be a Leader: We take initiative, speak up, and drive things forward, no matter our title. Leadership is a mindset, not a level.

  • Roll Up Our Sleeves: We do what it takes. No job is too small when we're building something big.

  • Embrace Adventure: We stay curious, push boundaries, and see challenges as opportunities. Startups are a rollercoaster and we're here for the ride.

  • Make Excellence the Baseline: We hold a high bar for quality and follow through. Doing great work is the starting point, not the finish line.

We're still early, which means your work will shape our path and our impact on communities and businesses everywhere. If you're excited by challenge, autonomy, and building something that matters, you'll feel at home here.

How to Apply

Send your resume (or LinkedIn) and a short note answering:

  1. One ML system you shipped end-to-end and what it actually changed

  2. A modeling decision you made that turned out to be wrong, and what you learned

  3. A paper, blog post, or repo you keep going back to (optional, but we love this)


The Company
HQ: Austin, Texas
29 Employees
Year Founded: 2022

What We Do

A platform for people to connect, sweat, and inspire.
