Senior AI Performance Engineer

Sorry, this job was removed at 12:10 a.m. (CST) on Tuesday, Jun 10, 2025
San Francisco, CA
In-Office
Artificial Intelligence • Generative AI
The Role

We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the boundaries of what's possible in video generation.

Role overview:

As a Deep Learning Performance Engineer at Genmo, you will play a critical role in optimizing the performance of our large generative AI models. Your expertise will ensure that our models run efficiently on clusters, leveraging advanced techniques and tools to enhance their performance. This role is perfect for someone with a deep understanding of deep learning performance bottlenecks, kernel optimization, and distributed training strategies.

Key responsibilities:
  • Analyze and optimize the performance of massively parallel and distributed systems

  • Implement and fine-tune distributed training strategies for multi-GPU and multi-node environments

  • Implement high-performance CUDA, Triton, C++, and PyTorch code

  • Profile model performance and identify bottlenecks using tools like NVIDIA Nsight Systems, PyTorch Profiler, and TensorFlow Profiler

  • Develop and maintain benchmarking suites for continuous performance monitoring

Qualifications:
  • Master's or PhD in Computer Science, Electrical Engineering, or a related field

  • 5+ years of experience optimizing deep learning models, preferably in a production environment

  • Must have:

    • Strong programming skills in Python and C++, with experience training large models in Python using PyTorch and/or TensorFlow, including their distributed training frameworks

    • Proven track record of optimizing large-scale models (10B+ parameters)

    • Deep understanding of GPU architecture and CUDA programming

    • Experience across the entire development pipeline, from data processing, preparation, and loading to training and inference

    • Experience optimizing and deploying inference workloads for throughput and latency across the stack (inputs, model inference, outputs, parallel processing, etc.)

    • Demonstrated expertise in high-performance computing using NVIDIA Triton and CUDA

    • Demonstrated ability to significantly improve model inference and training speeds through low-level optimizations

  • Ideal candidates will have:

    • Knowledge of distributed inference systems for handling high-volume workloads

    • Strong background in linear algebra, optimization, and machine learning algorithms

    • Experience with generative AI models (GANs, Diffusion Models, Transformers)

    • Knowledge of hardware-aware neural architecture design

    • Experience with high-performance computing (HPC) environments

    • Contributions to relevant open source projects or research publications

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

The Company
San Francisco, CA
50 Employees

What We Do

Enabling the next billion AI video creators with Genmo
