We're seeking experienced Machine Learning Engineers and Software Engineers with ML experience to design and build high-quality RL training environments for LLM agents. As an RL Environment Engineer, you'll create diverse machine learning tasks that challenge and improve language models, working with minimal supervision to deliver consistent, quality outputs.
What You'll Do
Design and build tasks for machine learning domains that target specific language models and difficulty distributions
Iterate rapidly on task designs based on customer feedback, with 24-hour turnaround times
Create diverse, challenging scenarios that test language model capabilities and expose their limitations
Hit the ground running with minimal onboarding time
Strong machine learning background through coursework, previous work experience, or personal projects
Python fluency: you write clean, efficient Python code regularly
Heavy LLM user who understands current model capabilities and failure modes through daily hands-on experience
Self-directed and creative. You can generate novel ML task ideas in your domain without constant guidance
High responsibility and integrity. You deliver quality work consistently and meet deadlines
Availability overlap with PST 9am-5pm (minimum 3 hours required)
Location: Remote
Type: Contractor
Time Commitment: 40 hours a week. Must have at least 3 hours of overlap with PST business hours (9am-5pm)
Selection Process:
Screening
HackerRank assessment
1-week paid task
Full time
Skills Required
- Experience as a Machine Learning Engineer or Software Engineer with ML experience
- Strong machine learning background via coursework, work experience, or personal projects
- Fluency in Python, including regularly writing clean, efficient code
- Daily hands-on use and deep understanding of Large Language Models and their failure modes
- Experience designing and building reinforcement learning (RL) training environments and tasks
- Self-directed, creative ability to generate novel ML task ideas with minimal guidance
- High responsibility, integrity, and consistent on-time delivery of quality work
- At least 3 hours of availability overlap with PST business hours (9am-5pm)
- Ability to work 40 hours per week as a contractor
- Ability to iterate rapidly on task designs with quick turnaround (e.g., 24-hour turnaround)
What We Do
Careerflow.ai is an AI-powered career management platform and 'career copilot' dedicated to helping job seekers land their dream jobs. The company provides a comprehensive end-to-end toolkit featuring an AI resume builder, LinkedIn profile optimizer, and job tracking tools. By streamlining the application process and optimizing professional profiles, Careerflow helps users navigate the competitive job market and get hired at top tech and startup companies faster.