On-Device Machine Learning Engineer

Austin, TX
In-Office
Mid level
Artificial Intelligence • Cloud • Machine Learning • Software
The Role
The On-Device Machine Learning Engineer will optimize and deploy ML models on consumer hardware, enhancing performance while ensuring privacy and reliability. Responsibilities include model optimization, lifecycle management, and collaboration with product teams to achieve high user experience benchmarks.

About Us:

webAI is pioneering the future of artificial intelligence by establishing the first distributed AI infrastructure dedicated to personalized AI. We recognize the evolving demands of a data-driven society for scalability and flexibility, and we firmly believe that the future of AI lies in distributed processing at the edge, bringing computation closer to the source of data generation. Our mission is to build a future where a company's valuable data and intellectual property remain entirely private, enabling the deployment of large-scale AI models directly on standard consumer hardware without compromising the information embedded within those models. We are developing an end-to-end platform that is secure, scalable, and fully under the control of our users, empowering enterprises with AI that understands their unique business. We are a team driven by truth, ownership, tenacity, and humility, and we seek individuals who resonate with these core values and are passionate about shaping the next generation of AI.

About the Role

We’re looking for an On-Device Machine Learning Engineer to bring modern ML capabilities directly onto consumer hardware: fast, private, and reliable. You’ll own the design, optimization, and lifecycle of models running locally (e.g., iPhone/iPad/Mac-class devices), with a sharp focus on latency, battery, thermal behavior, and real-world UX. This role sits at the intersection of ML systems, product engineering, and performance tuning, and will help power local RAG, memory, and personalized experiences without relying on the network.

What You’ll Do

On-device model optimization and deployment

  • Convert, optimize, and deploy models to run efficiently on-device using Core ML and/or MLX (a conversion-and-quantization sketch follows this list).

  • Implement quantization strategies (e.g., 8-bit / 4-bit where applicable), compression, pruning, distillation, and other techniques to meet performance targets.

  • Profile and improve model execution across compute backends (CPU/GPU/Neural Engine where relevant), and reduce memory footprint.
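
A minimal sketch, assuming a coremltools-based path, of what converting and weight-quantizing a model for on-device use might look like. The torchvision model, input shape, quantization mode, and iOS 17 target are illustrative placeholders rather than a prescribed pipeline, and the exact optimize API can vary across coremltools versions:

```python
# Hedged sketch, not a prescribed pipeline: convert a PyTorch model to
# Core ML and apply 8-bit weight quantization with coremltools.
# The torchvision model, input shape, and iOS 17 target are placeholders.
import torch
import coremltools as ct
from torchvision.models import mobilenet_v3_small

model = mobilenet_v3_small().eval()                 # stand-in for the real model
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=(1, 3, 224, 224))],
    compute_units=ct.ComputeUnit.ALL,               # let Core ML pick CPU/GPU/ANE
    minimum_deployment_target=ct.target.iOS17,
)

# 8-bit linear weight quantization (coremltools >= 7). 4-bit palettization,
# pruning, or joint compression follow the same OptimizationConfig pattern.
op_config = ct.optimize.coreml.OpLinearQuantizerConfig(mode="linear_symmetric")
config = ct.optimize.coreml.OptimizationConfig(global_config=op_config)
mlmodel_w8 = ct.optimize.coreml.linear_quantize_weights(mlmodel, config=config)

mlmodel_w8.save("Model_w8.mlpackage")
```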

Local RAG + memory systems

  • Build and optimize local retrieval pipelines (embeddings, indexing, caching, ranking) that work offline and under tight resource constraints (a retrieval sketch follows this list).

  • Implement local memory systems (short/long-term) with careful attention to privacy, durability, and performance.

  • Collaborate with product/design to translate “memory” behavior into concrete technical architectures and measurable quality targets.
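
As a rough illustration of the retrieval core (not a prescribed design), a brute-force cosine-similarity index over locally stored embeddings; the embedding model, persistence layer, ANN indexing, caching, and ranking are all assumed to live elsewhere:

```python
# Minimal sketch of an offline retrieval core: brute-force cosine search over
# locally stored embeddings. Vectors are assumed to come from an on-device
# embedding model; persistence, ANN indexing, caching, and ranking are omitted.
import numpy as np

class LocalIndex:
    def __init__(self, dim: int):
        self.dim = dim
        self.vectors = np.empty((0, dim), dtype=np.float32)
        self.docs: list[str] = []

    def add(self, doc: str, vector: np.ndarray) -> None:
        v = vector.astype(np.float32)
        v /= (np.linalg.norm(v) + 1e-8)          # store unit vectors
        self.vectors = np.vstack([self.vectors, v])
        self.docs.append(doc)

    def search(self, query_vec: np.ndarray, k: int = 5) -> list[tuple[str, float]]:
        q = query_vec.astype(np.float32)
        q /= (np.linalg.norm(q) + 1e-8)
        scores = self.vectors @ q                 # cosine similarity
        top = np.argsort(-scores)[:k]
        return [(self.docs[i], float(scores[i])) for i in top]
```

On real devices, the same interface would typically be backed by a memory-mapped or quantized index and an on-device embedding model rather than in-memory NumPy arrays.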

Model lifecycle on consumer hardware

  • Own the on-device model lifecycle: packaging, versioning, updates, rollback strategies, on-device A/B testing approaches, telemetry, and quality monitoring (a versioning sketch follows this list).

  • Build robust evaluation and regression suites that reflect real device constraints and user workflows.

  • Ensure models degrade gracefully (low-power mode, thermals, backgrounding, OS interruptions).
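
One possible shape for the lifecycle bookkeeping, sketched as a local manifest with a simple rollback rule; the file layout, field names, and failure threshold are illustrative assumptions, not a fixed scheme:

```python
# Hypothetical sketch of on-device model lifecycle bookkeeping: a local
# manifest tracking the active model version with a rollback path.
import json
from pathlib import Path

MANIFEST = Path("models/manifest.json")   # illustrative location

def load_manifest() -> dict:
    if MANIFEST.exists():
        return json.loads(MANIFEST.read_text())
    return {"active": None, "previous": None, "failures": 0}

def promote(version: str) -> None:
    # Make a newly downloaded/validated model the active one.
    m = load_manifest()
    m["previous"], m["active"], m["failures"] = m["active"], version, 0
    MANIFEST.write_text(json.dumps(m, indent=2))

def record_failure(max_failures: int = 3) -> None:
    # Roll back to the previous version after repeated load/eval failures.
    m = load_manifest()
    m["failures"] += 1
    if m["failures"] >= max_failures and m["previous"]:
        m["active"], m["previous"], m["failures"] = m["previous"], None, 0
    MANIFEST.write_text(json.dumps(m, indent=2))
```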

Performance, reliability, and user experience

  • Treat battery, thermal, and latency as first-class product requirements: instrument, benchmark, and optimize continuously (a benchmarking sketch follows this list).

  • Design inference pipelines and scheduling strategies that respect app responsiveness, animations, and UI smoothness.

  • Partner with platform engineers to integrate ML into production apps with clean APIs and stable runtime behavior.
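
A small benchmarking harness of the kind this implies, timing repeated inferences and reporting latency percentiles; predict() is a stand-in for the actual on-device inference call, and battery/thermal instrumentation would be layered on separately:

```python
# Simple latency harness sketch: warm up, time repeated inferences, and report
# percentiles. predict() stands in for whatever inference call is under test.
import time
import statistics

def benchmark(predict, sample, warmup: int = 10, iters: int = 200) -> dict:
    for _ in range(warmup):
        predict(sample)                           # warm caches, compile kernels
    latencies_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        predict(sample)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    qs = statistics.quantiles(latencies_ms, n=100)
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98],
            "mean": statistics.fmean(latencies_ms)}
```

On-device, a loop like this would typically run inside an instrumented build so energy and thermal impact can be read alongside the latency numbers.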

What We’re Looking For
  • Strong experience shipping ML features into production, ideally including mobile / edge / consumer devices.

  • Hands-on proficiency with Core ML and/or MLX, and the practical realities of running models locally.

  • Solid understanding of quantization and optimization techniques for inference (accuracy/perf tradeoffs, calibration, benchmarking).

  • Experience building or operating retrieval systems (embedding generation, vector search/indexing, caching strategies)—especially under resource constraints.

  • Fluency in performance engineering: profiling, latency breakdowns, memory analysis, and tuning on real devices.

  • Strong software engineering fundamentals: maintainable code, testing, CI, and debugging across complex systems.

Nice to Have
  • Experience with on-device LLMs, multimodal models, or real-time interactive ML features.

  • Familiarity with Metal / GPU compute, or performance tuning of ML workloads on Apple platforms.

  • Experience designing privacy-preserving personalization and memory (local-first data handling, encryption, retention policies).

  • Experience building developer tooling for model packaging, benchmarking, and release management.

  • Prior work on offline-first architectures, edge inference, or battery/thermal-aware scheduling.

We at webAI are committed to living out the core values we have put in place as the foundation on which we operate as a team. We seek individuals who exemplify the following:

  • Truth - Emphasizing transparency and honesty in every interaction and decision.

  • Ownership - Taking full responsibility for one’s actions and decisions, demonstrating commitment to the success of our clients.

  • Tenacity - Persisting in the face of challenges and setbacks, continually striving for excellence and improvement.

  • Humility - Maintaining a respectful and learning-oriented mindset, acknowledging the strengths and contributions of others.

Benefits:

  • Competitive salary and performance-based incentives.

  • Comprehensive health, dental, and vision benefits package.

  • 401(k) Match (US-based only)

  • $200/month Health and Wellness Stipend

  • $400/year Continuing Education Credit

  • $500/year Function Health subscription (US-based only)

  • Free parking for in-office employees

  • Unlimited Approved PTO

  • Parental Leave for Eligible Employees

  • Supplemental Life Insurance


webAI is an Equal Opportunity Employer and does not discriminate against any employee or applicant on the basis of age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We adhere to these principles in all aspects of employment, including recruitment, hiring, training, compensation, promotion, benefits, social and recreational programs, and discipline. In addition, it is the policy of webAI to provide reasonable accommodation to qualified employees who have protected disabilities to the extent required by applicable laws, regulations and ordinances where a particular employee works.

Top Skills

Core ML
MLX

The Company
HQ: Austin, Texas
49 Employees
Year Founded: 2020

What We Do

webAI is designed to streamline the training, deployment, and execution of AI models by offering a unified execution layer for AI that seamlessly integrates cloud-based services and local devices. Our goal is to revolutionize the AI industry by laying the foundation for the development of Super Intelligence (SI).

By approaching AI execution in a heterogeneous manner, we aim to create scalable, industry-leading AI products. Our platform is dedicated to enhancing the performance and accessibility of AI technologies, making them more efficient and user-friendly for both businesses and individuals.
