PyTorch & MLOps AI Specialist

Posted 3 Days Ago
Hiring Remotely in United States
Remote
$70-$110 Hourly
Junior
Artificial Intelligence • HR Tech • Professional Services • Software
The Role
Contribute to generative AI model training and evaluation by designing and solving ML infrastructure and systems challenges. Build and optimize distributed training, custom GPU kernels, and evaluation frameworks, and provide technical reviews and feedback to improve training data and model capabilities.
Summary Generated by Built In

This role is for one of our clients

Compensation: $70-$110 per hour

Join a leading AI lab's cutting-edge Generative AI team and play a key role in developing next-generation large language models. We are seeking experienced MLOps and ML Systems Engineers with deep expertise in PyTorch and kernel-level programming frameworks such as Triton or Pallas.

In this role, you will contribute to AI model training and evaluation initiatives by designing, solving, and reviewing advanced machine learning infrastructure and systems challenges. Your expertise will help improve the quality of training data used to develop frontier AI systems.

This is a full-time (40 hours/week) engagement supporting high-impact AI research and engineering efforts.


Requirements

Key Responsibilities
  • Partner with research and engineering teams to identify and address knowledge gaps in MLOps, machine learning infrastructure, and model training systems.
  • Design challenging, real-world tasks focused on distributed training, ML frameworks, model optimization, and infrastructure engineering.
  • Develop accurate, well-structured solutions to complex MLOps and ML systems problems.
  • Evaluate technical tasks and solutions, providing detailed and actionable feedback.
  • Create evaluation frameworks and scoring rubrics for training pipeline architecture, distributed systems reasoning, performance optimization, and kernel-level programming.
  • Contribute domain expertise to improve AI model capabilities in machine learning engineering topics.
  • Collaborate with other subject matter experts to ensure consistency, quality, and technical accuracy across datasets and evaluations.
Required Qualifications
  • 2+ years of professional experience in ML Infrastructure, MLOps, ML Systems Engineering, or a closely related field.
  • Strong hands-on experience building and operating production-scale machine learning systems.
  • Advanced proficiency with PyTorch, including model training, optimization, and deployment workflows.
  • Experience developing, tuning, or optimizing custom GPU kernels using Triton, Pallas, or similar frameworks.
  • Demonstrated career growth and increasing technical responsibility.
  • Ability to commit to a full-time, 40-hour-per-week schedule during standard business days.
  • Excellent written communication skills and the ability to clearly explain complex technical concepts and engineering decisions.
Preferred Qualifications
  • Experience with large-scale distributed training frameworks and infrastructure.
  • Knowledge of GPU performance optimization and compiler-level ML tooling.
  • Familiarity with modern AI training pipelines, model evaluation methodologies, and LLM development workflows.
  • Experience mentoring engineers or contributing to technical standards and best practices.
  • Background in cloud-native ML infrastructure and production deployment environments.
Why Join
  • Work alongside leading AI researchers and engineers on frontier AI systems.
  • Influence the development and evaluation of next-generation large language models.
  • Apply your expertise to solve challenging machine learning infrastructure and optimization problems.
  • Contribute to high-impact projects at the forefront of AI innovation.
Additional Information
  • Full-time engagement requiring 40 hours per week.
  • Dedicated commitment is expected during the engagement period.
  • Responsibilities and project scope may evolve based on research priorities and business needs.
Equal Opportunity Statement

All qualified applicants will be considered without regard to legally protected characteristics. Reasonable accommodations are available upon request.

The Company
Year Founded: 2021

What We Do

Weekday is an AI-powered recruitment platform that helps startups hire top-tier engineering and product talent. By leveraging a massive database of white-collar professionals and advanced outreach tools, the company streamlines the hiring process through automated sourcing, AI-driven resume screening, and white-glove contingency services. Its mission is to modernize recruitment by helping companies discover and engage passive candidates efficiently and make high-quality hires for critical roles.
