AssemblyAI

Senior Research Engineer

Reposted 3 Days Ago

Hiring Remotely in USA

Remote

240K-275K Annually

Senior level

Artificial Intelligence

Today’s top Voice AI companies rely on AssemblyAI’s speech-to-text and speech understanding models to launch groundbreak

The Role

This role involves optimizing large-scale distributed training and inference systems, implementing deep learning optimizations, and collaborating across teams to enhance AI models.

Summary Generated by Built In

About AssemblyAI

AssemblyAI builds the best-in-class Speech AI models powering the next generation of voice applications. Our models serve 600M+ inference calls monthly, process 1M+ hours of audio daily, and power 2 billion+ end-user experiences—from voice agents and meeting assistants to contact centers and medical scribes. Companies like Zoom, Granola, Fireflies, Cluely, and Calabrio rely on AssemblyAI to ship production-ready voice AI.

We're at an inflection point in Speech AI. We released Universal-Streaming in mid-2025, and it has quickly earned its place as the model offering the best accuracy-latency-cost tradeoff on the market. The adoption has been significant: we now process ~1.5M streaming hours per week, with 25x usage growth in the last six months alone. Our research team drives these advances and ships with relentless velocity. Since releasing Universal-Streaming, we've already launched a keyterms prompting feature and multilingual support, with more significant improvements on the roadmap.

We've raised $115M+ from Accel, Insight Partners, Y Combinator's AI Fund, Patrick and John Collison, Nat Friedman, and Daniel Gross. We're a remote team building one of the next great AI companies—and we're looking for researchers who will shape its future.

About the Role

We are seeking a highly skilled Senior Research Engineer to collaborate closely with both Research and Engineering teams. The role involves diagnosing and resolving bottlenecks across large-scale distributed training, data processing, and inference systems, while also driving optimizations for existing high-performance pipelines.

The ideal candidate possesses a deep understanding of modern deep learning systems, combined with strong engineering expertise in areas such as layer-level optimization, large-scale distributed training, streaming, low-latency and asynchronous inference, inference compilers, and advanced parallelization techniques.

This is a cross-functional role requiring strong technical rigor, attention to detail, intellectual curiosity, and excellent communication skills. The position is embedded within the Research team and is responsible for developing and refining the technical foundation that enables cutting-edge research and translates its outcomes into production.

What You'll Do

Investigate and mitigate performance bottlenecks in large-scale distributed training and inference systems.
Develop and implement both low-level (operator/kernel) and high-level (system/architecture) optimization strategies.
Translate research models and prototypes into highly optimized, production-ready inference systems.
Explore and integrate inference compilers such as TensorRT, ONNX Runtime, AWS Neuron and Inferentia, or similar technologies.
Design, test, and deploy scalable solutions for parallel and distributed workloads on heterogeneous hardware.
Facilitate knowledge transfer and bidirectional support between Research and Engineering teams, ensuring alignment of priorities and solutions.

What You'll Need

Strong expertise in the Python ecosystem and major ML frameworks (PyTorch, JAX).
Experience with lower-level programming (C++ or Rust preferred).
Deep understanding of GPU acceleration (CUDA, profiling, kernel-level optimization); TPU experience is a strong plus.
Proven ability to accelerate deep learning workloads using compiler frameworks, graph optimizations, and parallelization strategies.
Solid understanding of the deep learning lifecycle: model design, large-scale training, data processing pipelines, and inference deployment.
Strong debugging, profiling, and optimization skills in large-scale distributed environments.
Excellent communication and collaboration skills, with the ability to clearly prioritize and articulate impact-driven technical solutions.

Pay Transparency:

AssemblyAI strives to recruit and retain exceptional talent from diverse backgrounds while ensuring pay equity across our team. Our salary ranges are set to be competitive for our size, stage, and industry, and reflect just one component of the full compensation, benefits, and rewards we offer.

Salary determinations consider a variety of factors, including relevant experience, technical depth, skills demonstrated during the interview process, and maintaining internal equity with peers on the team. The range shared below represents a general expectation for the posted position. However, we are open to considering candidates who may fall above or below the outlined experience level—in those cases, we will communicate any adjustments to the expected salary range.

The range provided applies to candidates located in the United States. For candidates outside of the U.S., compensation ranges may differ; any adjustments will be communicated throughout the interview process.

Salary range: $210,000 - $309,000

The expected base compensation for this role is listed above. Our total compensation package includes competitive equity grants, 100% employer-paid benefits, and the flexibility of being fully remote. We also offer a 401(k) match of up to 4% for US-based full-time team members.

Working at AssemblyAI

We are a small but mighty group of startup veterans and experienced AI researchers with over 20 years of expertise in Machine Learning, Speech Recognition, and NLP. As a fully remote team, we’re looking for people who are ambitious and curious and who lead with integrity. We’re still in the early days of AI and of AssemblyAI’s journey, and are looking for teammates who won’t just fit in, but will help us define and build our company culture.

We’re committed to creating a space where our employees can bring their full selves to work and have equal opportunity to succeed. No matter your race, gender identity or expression, sexual orientation, religion, origin, ability, age, or veteran status, if joining this mission speaks to you, we encourage you to apply!

Using AI to Interview:

If you’re selected for an interview, please review this resource to better understand how AssemblyAI approaches the use of AI in our interview process.

GDPR privacy notice:

Candidates from the EU should review this job applicant privacy notice before applying.

Keep Exploring AssemblyAI:

Check us out on YouTube!

Learn more about AI models for speech recognition

Speech-to-Text | Speech Understanding | LLM Gateway | Try the Playground

Our $50M Series C fundraise

Top Skills

AWS Neuron

C++

CUDA

Inferentia

JAX

ONNX Runtime

Python

PyTorch

Rust

TensorRT

The Company

HQ: New York, New York

75 Employees

Year Founded: 2017

What We Do

AssemblyAI’s speech-to-text and speech understanding models are the market leader in accuracy, reliability, and performance — providing the outputs you need to accelerate growth and build enterprise-grade product experiences.

Our customers include Cluely, Granola, Metaview and Zoom. We’ve raised funding from leading investors including Accel, Insight Partners, Y Combinator’s AI Fund, Patrick and John Collison, Nat Friedman, and Daniel Gross. As part of a huge and emerging market, AssemblyAI is well on its way to becoming the leader in applied AI.

Why Work With Us

We're a remote-first team of interdisciplinary research leaders, scientists, and engineers focused on building and scaling new state-of-the-art Speech AI models that are accurate, capable, easy to use, and safe. Today, our technology is being widely deployed to recognize, understand, and process human speech for thousands of customers.