BrightHire is a category-creating, high-growth, Series B software company with a mission to give everyone the hiring experience they deserve.
We deliver on this mission by transforming the way many of the world’s leading companies build exceptional teams. We created the Interview Intelligence category, and our clients range from some of the world’s most innovative companies, including Canva, OpenAI, Ramp, and HubSpot, to the Fortune 500.
Remote - USA
About the Role
You will partner closely with our Engineers, Product, and Design to turn early-stage AI features into high-quality, performant production features that delight users at scale. Your focus will be on quality and safety testing: devising rigorous evaluation frameworks, refining prompts and pipelines, and optimizing model choices for cost, latency, accuracy, tone, and safety. You will help build the shared AI platform that powers products such as:
- AI Interviewer conversation loops that adapt in real time
- AI Fraud Signals that flag suspicious behavior with minimal false positives
- AI Candidate skills matrices and assistants that surface instant insights
Responsibilities
- Design and own comprehensive evaluations that measure accuracy, completeness, style, hallucination rate, bias, and safety across every release.
- Tune and iterate on RAG pipelines, prompt chains, conversation loops, provider selections, and fine-tunes until quality bars are met or exceeded.
- Build reusable data and evaluation pipelines, a shared semantic layer, and monitoring dashboards that make it easy for product teams to ship reliable AI quickly.
- Optimize for cost and latency, continuously benchmarking models and negotiating trade-offs between performance and spend.
- Implement robust data governance and lineage practices that satisfy enterprise compliance requirements and support our AI bias audit process.
- Document best practices and share knowledge to raise the bar for AI development across BrightHire.
Qualifications
- 5+ years in Data Science or ML engineering with a strong focus on ML or NLP systems.
- 1+ year focused on Gen-AI or LLM systems.
- Strong Python and SQL skills.
- Experience creating automated evaluation suites for LLM outputs (accuracy, safety, bias, tone, style) and using results to guide iterative improvements.
- Knowledge of prompt engineering, RAG techniques, vector search, embeddings, fine-tuning, and model selection across multiple providers.
- Ability to communicate complex AI trade-offs clearly to engineers, designers, and executives alike.
- Bias toward action, curiosity, and a passion for building high-quality user experiences.
Why Join Us
- You’ll have the opportunity to work on high-impact projects in small, autonomous squads, with the flexibility to lead initiatives or focus as an individual contributor depending on your goals and interests.
- Our developer experience is thoughtfully designed, with fast CI (< 10 minutes), 1-click deploys, strong observability, and a clean codebase that enables you to move quickly and confidently.
- Our culture supports sustainable, focused work with fully remote roles, regular working hours, no-meeting Wednesdays, and flexible time off to recharge when needed.
- Our team is composed of smart, collaborative, and genuinely kind people, creating an environment where you can learn, grow, and do your best work.
Our company does not discriminate in employment on the basis of race, color, religion, sex (including pregnancy and gender identity), national origin, political affiliation, sexual orientation, marital status, disability, genetic information, age, membership in an employee organization, retaliation, parental status, military service, or other non-merit factor.
What We Do
BrightHire is the first interview intelligence platform, transforming how the world's fastest-growing companies scale by making the hiring process better, faster, more equitable, and, above all, human.

