Software Engineer, LLM Inference Engine and Product

Sorry, this job was removed at 12:14 a.m. (CST) on Tuesday, Aug 05, 2025
San Francisco, CA
In-Office
Artificial Intelligence • Software
The Role

Job title: Software Engineer, LLM Inference Engine and Product / Member of Technical Staff

Who We Are
WaveForms AI is an Audio Large Language Model (LLM) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions, making them more natural, engaging, and immersive.

Role Overview
The Software Engineer, LLM Inference Engine and Product will focus on developing and optimizing a real-time inference engine for multimodal large language models (LLMs) that handle audio and text inputs seamlessly. This role involves leveraging technologies such as LiveKit, RTC engines, WebRTC, and FastAPI to create an efficient, real-time API layer. You will contribute to cutting-edge AI systems that enable smooth user experiences across platforms, including iOS, Android, and desktop.

Key Responsibilities

  • Real-time Inference Development: Build and optimize a robust inference engine that supports multimodal LLMs, handling real-time audio and text inputs.

  • Technology Integration: Leverage tools like LiveKit, RTC engines, WebRTC, and FastAPI to enable low-latency, real-time communication and inference.

  • End-to-End Pipeline Design: Create and maintain the complete inference pipeline, from data ingestion to model serving, ensuring real-time performance.

  • Cross-platform Compatibility: Ensure the inference engine operates efficiently across various platforms, including mobile (iOS/Android) and desktop.

  • Optimization & Performance Tuning: Optimize the inference system to reduce latency, improve throughput, and enhance user experience.

  • API Development: Design and maintain scalable APIs to support real-time LLM interaction for diverse applications.

Required Skills & Qualifications

  • Inference Engine Expertise: Proven experience in building and optimizing inference engines for multimodal AI systems, particularly combining audio and text inputs.

  • Technical Proficiency: Strong experience with LiveKit, RTC engines, WebRTC, and FastAPI for real-time communication and model inference.

  • Real-time System Design: Expertise in creating real-time pipelines and maintaining low-latency performance in production systems.

  • Cross-platform Development: Familiarity with iOS, Android, and desktop app development, ensuring seamless integration with inference systems.

  • Performance Optimization: Proficiency in optimizing inference engines to reduce latency and improve computational efficiency.

  • API Development: Experience in designing and maintaining APIs for real-time AI applications.

Minimum Experience

  • 4-5 years of relevant professional experience is required

The Company
HQ: San Francisco, CA
9 Employees
Year Founded: 2024

What We Do

WaveForms AI is an Audio LLM research and product company aiming to solve the Speech Turing Test and create Emotional General Intelligence. Learn more at waveforms.ai/about.
