Principal Machine Learning Engineer, Mobile AI Inference Optimization

Posted 22 Days Ago
Mountain View, CA, USA
Hybrid
278K-348K Annually
Senior level
AdTech • Artificial Intelligence • Gaming • Machine Learning • Software • Virtual Reality • Metaverse
Unity is the leading platform to create and grow games and interactive experiences.
The Role
The Principal Machine Learning Engineer leads the deployment of multi-modal AI models on mobile platforms, setting technical vision and optimizing inference performance, while mentoring a team and collaborating cross-functionally.
Summary Generated by Built In

The opportunity
We are building the next generation of mobile game AI experiences by deploying world models directly on mobile devices. As our Principal Machine Learning Engineer, you will be the foremost technical authority on bringing state-of-the-art multi-modal models (transformers, diffusion networks, and JAPE-style architectures) from research to production on mobile hardware.

This is a deeply hands-on, high-impact role. You will define the inference strategy, drive architectural decisions across the full mobile ML stack, and mentor a team of senior and mid-level engineers. Your work will directly determine the latency, quality, and power profile of AI-driven features experienced by billions of mobile game players.

What you'll be doing

  • Technical Leadership:
      • Set the technical vision and roadmap for deploying multi-modal AI models to iOS and Android, spanning transformers, diffusion models, and JAPE-style generative architectures.
      • Make authoritative decisions on model compression, quantization, pruning, and knowledge distillation strategies to meet mobile latency and memory budgets.
      • Evaluate and select inference runtimes (e.g., CoreML, ONNX Runtime Mobile, TFLite, ExecuTorch) and drive adoption across the team.
      • Own the end-to-end optimization pipeline: from model export and graph transformation to hardware-specific kernel tuning on NPU, GPU, and CPU.
  • Architecture & Research Translation:
      • Collaborate directly with research scientists to translate novel model architectures into deployable, mobile-optimized implementations.
      • Design scalable systems for multi-modal inference that process diverse inputs (images, text, primitives, and metadata) and produce pixel-level outputs with real-time performance.
      • Pioneer new approaches to dynamic resolution, token reduction, and speculative decoding tailored to mobile constraints.
      • Track and rapidly adopt breakthroughs in efficient diffusion (e.g., consistency models, flow matching) and efficient attention (e.g., FlashAttention, linear attention variants).
  • Team & Cross-Functional Leadership:
      • Lead and mentor a team of ML engineers; define engineering best practices, code review standards, and on-device benchmarking methodology.
      • Partner with platform engineers, product managers, and runtime teams to align ML capabilities with device SKU constraints and product roadmaps.
      • Champion a culture of measurement: define KPIs for latency, accuracy, memory, and power consumption and ensure the team tracks them rigorously.

What we're looking for

  • 8+ years in ML engineering, with at least 3 years focused on on-device / edge inference optimization.
  • Proven production deployment of transformer-based models (e.g., ViT, LLaMA, Stable Diffusion) and/or JAPE-style generative architectures on mobile or embedded hardware.
  • Hands-on expertise with CoreML, TFLite, ONNX Runtime, and/or ExecuTorch; deep understanding of operator fusion, memory layout, and runtime scheduling.
  • Expert-level command of INT8/INT4/FP16 quantization, weight sharing, structured/unstructured pruning, and knowledge distillation.
  • Strong understanding of mobile SoC architectures (Apple Neural Engine, Qualcomm Hexagon/Adreno, ARM Mali) and how to target each for peak throughput.
  • Proficiency in C++ / Objective-C / Swift for runtime integration; solid Python for training-side tooling and export pipelines.
  • Ability to read, implement, and extend ML research papers; familiarity with efficient attention, diffusion samplers, and multi-modal fusion techniques.
  • Track record of technical leadership: setting direction, influencing cross-functional partners, and growing engineers.

You might also have

  • Experience shipping world-model or neural rendering pipelines (NeRF, 3DGS, or similar) on mobile.
  • Contributions to open-source ML inference frameworks or mobile ML research publications.
  • Familiarity with compiler stacks such as MLIR, TVM, or XLA for custom kernel generation.
  • Background in real-time graphics or game engine pipelines (Metal, Vulkan, OpenGL ES).

Additional information

  • International relocation support is not available for this position

Benefits
At Unity, we want our team members to thrive. We offer a wide range of benefits designed to support well-being and work-life balance.

Please note: Benefits eligibility, specific offerings, and coverage vary based on the country and employment status.

While specific benefits vary, here are some of the ways we strive to take care of our eligible team members globally:

  • Comprehensive health, life, and disability insurance
  • Commute subsidy
  • Employee stock ownership
  • Competitive retirement/pension plans
  • Generous vacation and personal days
  • Support for new parents through leave and family-care programs
  • Office food and snacks
  • Mental Health and Wellbeing programs and support
  • Employee Resource Groups
  • Global Employee Assistance Program
  • Training and development programs
  • Volunteering and donation matching program

Life at Unity
Unity [NYSE: U] is the world’s leading game engine, powering play for more than 3 billion consumers each month. The top mobile games in the world, the most played PC indie titles, the most innovative console games, and virtually all of the top XR and Web Games are developed, deployed, and grown in Unity. Unity also enables teams across industries like automotive, manufacturing, and healthcare to design, simulate, and collaborate in 3D — closing the gap between ideas and reality. For more information, please visit www.unity.com.

Unity is a proud equal opportunity employer. We are committed to fostering an inclusive, innovative environment and celebrate our employees across age, race, color, ancestry, national origin, religion, disability, sex, gender identity or expression, sexual orientation, or any other protected status in accordance with applicable law. Our differences are strengths that enable us to support the growing and evolving needs of our customers, partners, and collaborators. If you have a disability that means there are preparations or accommodations we can make to help ensure you have a comfortable and positive interview experience, please fill out this form to let us know.

This position requires the incumbent to have a sufficient knowledge of English to have professional verbal and written exchanges in this language since the performance of the duties related to this position requires frequent and regular communication with colleagues and partners located worldwide and whose common language is English.

Headhunters and recruitment agencies may not submit resumes/CVs through this website or directly to managers. Unity does not accept unsolicited headhunter and agency resumes. Unity will not pay fees to any third-party agency or company that does not have a signed agreement with Unity.

Your privacy is important to us. Please take a moment to review our Prospect Privacy Policy and Applicant Privacy Policy. Should you have any concerns about your privacy, please contact us at [email protected].

*Note: Certain locations require a good faith disclosure of the base salary range for the role. The actual salary for the successful candidate may differ based on location, experience, and other job-related factors.

Gross pay salary
$278,100 to $347,600 USD

Skills Required

  • 8+ years in ML engineering
  • At least 3 years focused on on-device / edge inference optimization
  • Production deployment of transformer-based models on mobile or embedded hardware
  • Expertise with CoreML, TFLite, ONNX Runtime, ExecuTorch
  • Expert command of quantization and model compression techniques
  • Understanding of mobile SoC architectures
  • Proficiency in C++, Objective-C, Swift, Python
  • Ability to read and implement ML research papers
  • Technical leadership experience

Unity Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Unity and has not been reviewed or approved by Unity.

  • Healthcare Strength: Core medical, dental, vision, life/disability, and mental-health/EAP offerings are positioned as comprehensive across eligible locations. This breadth aligns with large-tech standards and is highlighted in official materials.
  • Retirement Support: A 401(k) plan with employer matching is part of the U.S. package. Retirement benefits are characterized as competitive and a stable element of total rewards.
  • Parental & Family Support: Paid parental leave and family-care support are emphasized, with indications of generous time off for new parents. These programs are presented as global in scope, with specifics verified by location.

The Company
HQ: New York, NY
4,500 Employees
Year Founded: 2004

Why Work With Us

We believe the world is a better place with more creators in it. This is at the core of our business because we believe our technology can change the world. Our products give content creators the tools to not just entertain but to create innovative RT3D experiences and deliver better processes for almost every industry.
