Applied ML Engineer

Posted 8 Days Ago
San Francisco, CA
Hybrid
Mid level
Healthtech • Software
The Role
The Applied ML Engineer will productionize and scale machine learning systems for a voice AI platform, optimizing deployment and performance in healthcare environments.

About Knowtex

Knowtex is building the future of voice AI operating systems for clinicians, transforming how healthcare documentation happens at the point of care. Founded by Stanford AI scientists with deep clinical experience, we're experiencing explosive growth across both commercial health systems and federal healthcare, with our ambient documentation platform scaling rapidly to thousands of clinicians across hundreds of specialties. We're at an inflection point where cutting-edge AI meets real clinical impact, giving clinicians hours back each day to focus on what matters most: their patients.

Position Overview

We are seeking an Applied ML Engineer to productionize and scale machine learning systems powering our voice AI platform. This role bridges research and engineering — transforming models into reliable, low-latency, production-grade systems deployed across enterprise healthcare environments.

You will work closely with ML Scientists, Backend Engineers, and Platform teams to optimize inference performance, build evaluation pipelines, and ensure robust model deployment in regulated environments.

Key Responsibilities

  • Productionize ML models for real-time clinical applications

  • Optimize inference pipelines for low latency and high throughput

  • Deploy and scale models using AWS-based infrastructure

  • Build automated evaluation and regression testing frameworks for LLM outputs

  • Implement monitoring systems for model performance and drift detection

  • Collaborate with Backend teams to integrate ML services into APIs and workflows

  • Improve model efficiency through quantization, batching, caching, and optimization techniques

  • Support specialty-level model evaluation and performance analysis

  • Contribute to CI/CD workflows for ML deployment

Required Qualifications

  • 3–7+ years of experience in machine learning engineering or applied ML roles

  • Strong proficiency in Python and PyTorch (or TensorFlow)

  • Experience deploying ML models in production environments

  • Familiarity with transformer architectures and large language models

  • Experience with model optimization techniques (quantization, distillation, pruning)

  • Experience working with cloud infrastructure (AWS preferred)

  • Strong software engineering fundamentals and debugging skills

Preferred Qualifications

  • Experience with speech recognition systems or NLP pipelines

  • Experience with Triton Inference Server or similar deployment frameworks

  • Familiarity with healthcare data or clinical documentation workflows

  • Experience working in regulated environments (HIPAA, GovCloud, etc.)

  • Knowledge of medical coding systems (ICD-10, CPT)

Technical Environment

  • Python, PyTorch / TensorFlow

  • Transformer-based LLM architectures

  • AWS (SageMaker, ECS, Lambda, S3)

  • Triton Inference Server

  • CI/CD pipelines for ML deployment

  • Observability tools for performance and drift monitoring

Compensation & Benefits

  • Meaningful equity compensation

  • Unlimited PTO

  • Premium health, dental, and vision coverage

  • 401(k) plan

Top Skills

AWS
Python
PyTorch
TensorFlow
Triton Inference Server

The Company
HQ: San Francisco, California
12 Employees

What We Do

Knowtex captures natural conversation between a clinician and patient, identifies relevant medical information, and generates a pre-filled note with medical coding suggestions, for clinician review and sign-off.

For clinicians, we’re decreasing time spent on manual documentation and allowing them to focus on patient care. For healthcare organizations, we’re improving accuracy and efficiency in medical billing and coding to help with reimbursements.
