ML Infrastructure Engineer

Posted 3 Days Ago
San Francisco, CA
Mid level
Artificial Intelligence • Machine Learning • Generative AI
The Role
The ML Infrastructure Engineer will optimize training throughput for internal frameworks, enhance performance, and support researchers in model development.
Summary Generated by Built In

About the Team

The Runtime team builds the low-level framework components that power our ML training systems. We build robust, scalable, high-performance components to support our distributed training workloads. Our priorities are to maximize the productivity of our researchers and our hardware, with the goal of accelerating progress towards AGI.

About the Role

As an ML Infrastructure Engineer, you will work on improving the training throughput of our internal training framework while enabling researchers to experiment with new ideas. This requires good engineering (for example, designing, implementing, and optimizing state-of-the-art AI models), writing bug-free machine learning code (surprisingly difficult!), and acquiring deep knowledge of the performance of supercomputers. In all the projects this role pursues, the ultimate goal is to push the field forward.

We’re looking for people who love optimizing performance and understanding distributed systems, and who cannot stand having bugs in their code. Since our training framework is used for large runs with massive numbers of GPUs, performance improvements here will have a large impact.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Apply the latest techniques in our internal training framework to achieve impressive hardware efficiency for our training runs

  • Profile and optimize our training framework

  • Work with researchers to enable them to develop the next generation of models

You might thrive in this role if you:

  • Have run small-scale ML experiments

  • Love figuring out how systems work and continuously come up with ideas for how to make them faster while minimizing complexity and maintenance burden

  • Have strong software engineering skills and are proficient in Python

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. 

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status. 

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

Top Skills

AI
Machine Learning
Python
Supercomputers

The Company
HQ: San Francisco, CA
224 Employees
On-site Workplace
Year Founded: 2015

What We Do

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. AI is an extremely powerful tool that must be created with safety and human needs at its core. OpenAI is dedicated to putting that alignment of interests first — ahead of profit.

To achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. Our investment in diversity, equity, and inclusion is ongoing, executed through a wide range of initiatives, and championed and supported by leadership.

