About Inworld
Inworld is a product-oriented research lab of top AI researchers and engineers, developing best-in-class realtime multimodal models and the only realtime orchestration platform optimized for thousands of queries per second.
We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.
About the role
Voice is one of the key interfaces through which humans will interact with AI at scale. To make this a reality, we are building the engine for the next generation of AI-driven software. Our primary focus is pushing the boundaries of speech modeling (STT & TTS). We approach this by researching and applying ML ideas that allow us to achieve state-of-the-art results (we recently ranked #1 on Artificial Analysis for Text-to-Speech models).
Working with audio is uniquely complex - arguably more so than text - because the solution space for how a specific phrase can be spoken is effectively infinite. This creates a vast landscape of challenges, from data collection and efficient training infrastructure to creating RL alignment environments and ultra-low-latency inference optimizations.
We are seeking Staff and Principal level AI Engineers to solve these challenges. You will be responsible for researching, building, optimizing, and deploying the production ML systems that thousands of developers integrate into their own systems. Your work will focus on the difficult research and engineering problems of building the engine for the next generation of AI-driven software.
Qualifications
A PhD in a relevant technical field, or a BA/BS degree with equivalent research and/or engineering experience.
5+ years of combined experience in software development (e.g. with Python or C++) and applied ML engineering.
Demonstrated experience applying or researching Machine Learning in one or more of the following domains:
Speech or video processing
Natural Language Processing (NLP)
Action planning
Strong foundation in data structures, algorithms, and neural network architectures.
Proficiency with ML frameworks such as PyTorch.
A good fit for this role may have
A passion for learning and staying up-to-date with the latest advancements in ML/Voice AI research and its applications.
Ability to work collaboratively in a fast-paced environment with shifting priorities.
Familiarity with pre-training, fine-tuning, RLHF and evaluation of large language and speech models.
Knowledge of working with embedded systems and/or running ML on edge devices.
Strong background in mathematics and/or physics.
We believe in the power of in-person collaboration to solve the hardest problems and foster a strong team culture. We offer relocation assistance and look forward to you joining us in our Mountain View office.
The base salary range for this full-time position is $260,000 - $385,000 + bonus + equity + benefits.
What We Do
Inworld is a fully integrated platform for AI characters that goes beyond large language models (LLMs) by adding configurable safety, knowledge, memory, narrative controls, multimodality, and more.
Inworld uses advanced AI to build generative characters whose personalities, thoughts, memories, and behaviors are designed to mimic the deeply social nature of human interaction. The Inworld platform lets you create characters with personality and contextual awareness to keep them in-world and on brand. Integrations make it easy for developers to deploy characters into immersive experiences, while scale and performance are optimized for real-time experiences.
We are a team of experts who have pioneered conversational AI platforms and generative models at API.AI (acquired by Google and renamed Dialogflow), Google and DeepMind. We are continuing to build out our incredibly talented team, with experts in generative language models, emotions, speech synthesis, multimodal interaction, design, and 3D animation.
Inworld is backed by top-tier investors including Section 32, Intel Capital, BITKRAFT Ventures, Kleiner Perkins, Founders Fund, First Spark Ventures, The Venture Reality Fund, CRV, Meta, Microsoft’s M12 fund, Micron Ventures, LG Technology Ventures, NTT Docomo Ventures, and SK Telecom Venture Capital. Inworld was one of six companies selected for the 2022 Disney Accelerator. Prominent angels include Twitch Co-Founder, Kevin Lin; Oculus Co-Founder, Nate Mitchell; Animoca Brands Co-Founder, Yat Siu; The Sandbox Co-Founder, Sebastien Borget; and NaHCO3, the family office of Riot Games Co-Founder, Marc Merrill.








