Tonic.ai

Machine Learning Engineer (NLP)

Posted 13 Days Ago

7 Locations

In-Office

125K-175K Annually

Mid level

Enterprise Web • Machine Learning • Other • Security • Software

The Fake Data Company

The Role

The Machine Learning Engineer will build production-grade NLP models for data privacy, manage the ML lifecycle, and collaborate across teams.

Summary Generated by Built In

Tonic.ai is looking for a hands-on Machine Learning Engineer to help build production-grade NLP systems that power our data privacy and information extraction products. You'll join a small, experienced team working at the intersection of LLMs, data privacy, and applied AI — developing and fine-tuning models that detect and redact sensitive information across diverse datasets.

What You’ll Do

Build and ship models. Fine-tune and evaluate transformer-based models (e.g., RoBERTa, Gemma, LLaMA) to support PII redaction, entity extraction, and synthetic data generation.
Own the ML lifecycle. From dataset curation and experiment tracking to model deployment and monitoring, you’ll own the full path from prototype to production.
Collaborate cross-functionally. Partner with Product and Design to shape how ML models drive user-facing features, and work with the broader engineering team to integrate them into scalable systems.
Experiment responsibly. Document your experiments, evaluate results rigorously, and help push the frontier of safe and explainable AI for data privacy.

What You’ll Bring

3+ years of professional experience in applied ML or data science with a focus on NLP
Proficiency in Python and deep learning frameworks such as PyTorch and Hugging Face Transformers
Hands-on experience with experiment tracking (e.g., Weights & Biases), distributed training (e.g., Accelerate), and model serving (e.g., vLLM)
Comfort working independently and iterating quickly — you enjoy the mix of research, engineering, and product thinking
Strong communication and collaboration skills

Bonus Points For

Experience with supervised and reinforcement-learning fine-tuning (e.g., TRL)
Familiarity with data privacy, PII redaction, or healthcare data
A public portfolio, blog, or open-source contributions that demonstrate your technical depth and curiosity

Why You’ll Love It Here

High autonomy and meaningful ownership — your models will ship to production, not sit in a notebook
Small, collaborative team with deep expertise in NLP and privacy
Opportunity to work with real-world, high-impact data in domains like healthcare and financial services

Benefits

Competitive salary and company equity
Unlimited PTO and generous parental leave
Medical, dental, and vision insurance
401(k) with employer contribution
Remote-friendly work environment

About Tonic.ai

Tonic.ai creates safe, high-quality synthetic data that helps developers move fast while protecting sensitive information. Thousands of engineers rely on Tonic-generated data daily to power development, testing, and CI/CD pipelines across industries including healthcare, financial services, logistics, and education. We’re growing fast and looking for builders who want to make privacy practical.

Top Skills

Accelerate

Hugging Face Transformers

Python

PyTorch

vLLM

Weights & Biases

The Company

HQ: San Francisco, California

73 Employees

Year Founded: 2018

What We Do

Tonic.ai frees developers to build with safe, high-fidelity synthetic data to accelerate software and AI innovation while protecting data privacy. Through industry-leading solutions for data synthesis, de-identification, and subsetting, our products enable on-demand access to realistic structured, semi-structured, and unstructured data for software development, testing, and AI model training. The product suite includes Tonic Fabricate for AI-powered synthetic data from scratch, Tonic Structural for modern test data management, and Tonic Textual for unstructured data redaction and synthesis. Unblock innovation, eliminate collisions in testing, accelerate your engineering velocity, and ship better products, all while safeguarding data privacy.

Why Work With Us

We wholeheartedly believe that data privacy is a human right. It isn’t just about complying with the latest regulation. It’s about helping organizations treat data the way we’d like our own data to be treated. Enabling developers along the way is what makes our work a win-win.