Tonic.ai is looking for a hands-on Machine Learning Engineer to help build production-grade NLP systems that power our data privacy and information extraction products. You'll join a small, experienced team working at the intersection of LLMs, data privacy, and applied AI — developing and fine-tuning models that detect and redact sensitive information across diverse datasets.
What You’ll Do
Build and ship models. Fine-tune and evaluate transformer-based models (e.g., RoBERTa, Gemma, LLaMA) to support PII redaction, entity extraction, and synthetic data generation.
Own the ML lifecycle. From dataset curation and experiment tracking to model deployment and monitoring — you’ll own the full path from prototype to production.
Collaborate cross-functionally. Partner with Product and Design to shape how ML models drive user-facing features, and work with the broader engineering team to integrate them into scalable systems.
Experiment responsibly. Document your experiments, evaluate results rigorously, and help push the frontier of safe and explainable AI for data privacy.
3+ years of professional experience in applied ML or data science with a focus on NLP
Proficiency in Python and deep learning frameworks such as PyTorch and Hugging Face Transformers
Hands-on experience with experiment tracking (e.g., Weights & Biases), distributed training (e.g., Accelerate), and model serving (e.g., vLLM)
Comfort working independently and iterating quickly — you enjoy the mix of research, engineering, and product thinking
Strong communication and collaboration skills
Experience with supervised and reinforcement learning fine-tuning (e.g., TRL)
Familiarity with data privacy, PII redaction, or healthcare data
A public portfolio, blog, or open-source contributions that demonstrate your technical depth and curiosity
High autonomy and meaningful ownership — your models will ship to production, not sit in a notebook
Small, collaborative team with deep expertise in NLP and privacy
Opportunity to work with real-world, high-impact data in domains like healthcare and financial services
Competitive salary and company equity
Unlimited PTO and generous parental leave
Medical, dental, and vision insurance
401(k) with employer contribution
Remote-friendly work environment
Tonic.ai creates safe, high-quality synthetic data that helps developers move fast while protecting sensitive information. Thousands of engineers rely on Tonic-generated data daily to power development, testing, and CI/CD pipelines across industries including healthcare, financial services, logistics, and education. We’re growing fast and looking for builders who want to make privacy practical.
What We Do
Tonic.ai frees developers to build with safe, high-fidelity synthetic data to accelerate software and AI innovation while protecting data privacy. Through industry-leading solutions for data synthesis, de-identification, and subsetting, our products enable on-demand access to realistic structured, semi-structured, and unstructured data for software development, testing, and AI model training. The product suite includes Tonic Fabricate for AI-powered synthetic data from scratch, Tonic Structural for modern test data management, and Tonic Textual for unstructured data redaction and synthesis. Unblock innovation, eliminate collisions in testing, accelerate your engineering velocity, and ship better products, all while safeguarding data privacy.
Why Work With Us
We wholeheartedly believe that data privacy is a human right. It isn’t just about complying with the latest regulation. It’s about helping organizations treat data the way we’d like our own data to be treated. Enabling developers along the way is what makes our work a win-win.
