Senior Research Engineer - Optimization

Posted 12 Days Ago
Palo Alto, CA
175K-250K Annually
5-7 Years Experience
Digital Media
The Role
Luma is seeking a Senior Research Engineer with experience in PyTorch, CUDA, and distributed systems to work on training and scaling up multimodal foundation models for intelligent systems. The ideal candidate will collaborate with Research Scientists to build cutting-edge models on GPUs and have a startup mindset focused on velocity, clear communication, and user-centric product development.
Summary Generated by Built In

We are looking for engineers with significant problem-solving experience in PyTorch, CUDA, and distributed systems. You will work with Research Scientists to build and train cutting-edge foundation models on thousands of GPUs.

Responsibilities

  • Ensure efficient implementation of models & systems for data processing, training, inference and deployment
  • Identify and implement optimization techniques for massively parallel and distributed systems
  • Identify and remedy efficiency bottlenecks (memory, speed, utilization) by profiling and implementing high-performance CUDA, C++ and PyTorch code
  • Work closely with the research team to ensure systems are designed to be as efficient as possible from start to finish
  • Build tools to visualize, evaluate and filter datasets
  • Implement cutting-edge product prototypes based on multimodal generative AI

Experience

  • Experience training large models using Python & PyTorch, including practical experience with the entire development pipeline, from data processing, preparation & data loading to training and inference.
  • Experience profiling CPU & GPU code in PyTorch, including with NVIDIA Nsight or similar tools.
  • Experience writing & improving highly parallel & distributed PyTorch code, with familiarity with DDP, FSDP, Tensor Parallel, etc.
  • Experience writing high-performance parallel C++. Bonus if done within an ML context with PyTorch, such as for data loading, data processing, or inference code.
  • Experience with high-performance CUDA and writing custom PyTorch kernels. Top candidates will be able to use tensor cores, optimize CUDA memory usage, and apply similar performance techniques.
  • Good to have: experience with deep learning concepts such as Transformers, and with multimodal generative models such as diffusion models and GANs.
  • Good to have: experience building inference / demo prototype code (including Gradio, Docker, etc.)
  • Please note this role is not meant for recent grads.

Your applications are reviewed by real people.

Top Skills

CUDA
Python
PyTorch
The Company
Minneapolis, MN
0 Employees
On-site Workplace

What We Do

Luma is a multimedia platform that delivers personalized movie and TV program selections from a range of sources to its viewers.