Deepgram

Senior Software Engineer, AI Inference

Sorry, this job was removed at 01:21 a.m. (CST) on Wednesday, Oct 16, 2024

Hiring Remotely in California

Remote

165K-220K Annually

Internship

Artificial Intelligence • Machine Learning • Natural Language Processing • Software

The world’s first AI speech API for transcription with human-level understanding.

The Role

Company Overview

Deepgram is a foundational AI company on a mission to transform human-machine interaction using natural language. We give any developer access to the fastest, most powerful voice AI platform including access to models for speech-to-text, text-to-speech, and spoken language understanding with just an API call. From transcription to sentiment analysis to voice synthesis, Deepgram is the preferred partner for builders of voice AI applications.

Opportunity

We are seeking a backend engineer focused on AI inference to join the team powering Deepgram’s core speech inference APIs. You’ll implement and optimize inference code, experiment with cutting-edge technologies, and develop, maintain, and deploy the stack of services behind our blazing-fast, massive-throughput inference system. This role blends work on backend services and systems with domain specialty in neural networks and GPU programming. Our team owns the applications that serve api.deepgram.com and empowers builders of innovative speech products by focusing on a world-class combination of reliability, efficiency, and latency.

What You’ll Do

Implement inference for novel model architectures developed by Deepgram’s trailblazing research team
Develop, test, and deploy application code for massive-scale production services
Debug complex system issues that include networking, scheduling, and high-performance computing interactions
Build tooling for internal analysis and benchmarking to identify opportunities for efficiency improvements
Experiment with optimization techniques for ML workloads on NVIDIA GPUs and ship the key wins to prod

You’ll Love This Role If You

Think of yourself as a generalist who also enjoys going deep in specific areas, so you might debug a customer issue one day and design an algorithm the next
Like sipping piña coladas and getting caught in the rain
Enjoy taking ownership of features from early collaborations with researchers through testing in production
Love getting nitty-gritty with profilers, hardware architectures, and inference algorithms
Want to work within the context of a humble, collaborative team that collectively owns mission-critical production services

It’s Important to Us That You Have

The ability to work collaboratively in a fast-paced environment and adapt to changing priorities
Proven industry experience building and shipping production services
Strong confidence in a lower-level language like C, C++, or Rust
Experience slicing large projects or initiatives into smaller experiments or incremental improvements
Expertise in an ML framework like Torch or TensorFlow
Experience with GPU programming using tools like CUDA or libraries like cuDNN, cuBLAS, etc.

It Would Be Great If You Also Had

Extensive professional experience with Rust and C++
Experience optimizing ML workloads in production
Familiarity with GPU hardware architecture and its impact on inference pipelines

Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $85 million in total funding after closing our Series B funding round last year. If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you!

Deepgram is an equal opportunity employer. We want all voices and perspectives represented in our workforce. We are a curious bunch focused on collaboration and doing the right thing. We put our customers first, grow together and move quickly. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, gender identity or expression, age, marital status, veteran status, disability status, pregnancy, parental status, genetic information, political affiliation, or any other status protected by the laws or regulations in the locations where we operate.

We are happy to provide accommodations for applicants who need them.

Compensation Range: $165K - $210K

The Company

HQ: Ann Arbor, MI

90 Employees

Hybrid Workplace

Year Founded: 2015

What We Do

Our end-to-end deep neural network is revolutionizing the speech-to-text market and taking on the big guys. We’re redefining what companies can do with voice by offering a platform with AI architectural advantage, not legacy tech retrofitted with AI. We’ve raised over $56M and have been recognized as an Inc. Best Workplace (2021 and 2022), a Forbes Top 50 AI Company to Watch (2021), and a CB Insights Top 100 AI Startup (2021), among others. Join us. We want to hear what you’ve got to say.

An NVIDIA partner and Y Combinator company.

Why Work With Us

Our culture, like our product, is constantly learning and evolving, but the heart of our team is enduring. We are a self-motivated, positive, passionate, and competitive group of people. At Deepgram, we put an emphasis on being ourselves, being curious, growing together, and being human. We are a unique bunch who celebrate our differences.

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We currently have a hybrid business model with a nationally distributed workforce and two physical offices, one in Ann Arbor, MI and another in Burlingame, CA.

Typical time on-site: Flexible

HQ: Ann Arbor, MI

Burlingame, CA
