Annotation operations: Stand up and scale workflows (task specs, queues, SLAs, QC gates, audits).
Gold/rubric programs: Build gold sets, hidden gold, calibration tests; drive rubric versioning and clear decision trees.
LLM evaluations: Design safety/quality eval suites (multi-turn, multilingual, multimodal); create red-team prompts; wire into CI for regressions.
Vendor management: Manage BPO/annotation vendors, including capacity planning and corrective actions.
Currently pursuing a Bachelor’s or Master’s degree in a relevant field (e.g., Data Science, Operations, Cognitive Science, or related).
Strong analytical and problem-solving skills with attention to detail.
Interest in Trust & Safety, AI/ML operations, or data quality.
Excellent communication and documentation skills.
Familiarity with annotation tools, labeling workflows, or quality control systems.
Exposure to LLMs, prompt engineering, or evaluation methods.
Experience working with vendors or cross-functional teams.
Reinforce Labs is a small team with a clear mission: Shape a future where online spaces are safe for everyone.
We value:
Willingness to learn new things, take on diverse tasks, and challenge yourself
Hands-on technical excellence
Ownership
Ability to handle ambiguity
Fast, results-driven delivery
What We Do
Making the internet a safer place using AI.