Senior QA Engineer

Posted 13 Hours Ago
2 Locations
In-Office or Remote
Senior level
Artificial Intelligence • Healthtech
The Role
Lead QA for AI-driven voice and web platforms: design and maintain Playwright automated suites, validate TTS/voice pipelines and Claude LLM integrations, build Node.js test tooling, deploy and triage test infra on GCP, define quality metrics, and mentor junior QA engineers.
Summary Generated by Built In
About Us:

Steer Health helps healthcare organizations improve patient access, reduce operational burden, and recover revenue through AI-native workflow automation. Our lead product, Luna AI, acts as a voice-based digital workforce, handling patient access workflows such as scheduling, intake, and follow-up. We sit on top of existing EHR infrastructure and focus on measurable operational outcomes.
About the Role

We are looking for a Senior QA Engineer who thrives at the intersection of AI, voice automation, and cloud-native systems. You will own quality across our platform — from testing LLM-powered features and voice pipelines to ensuring robust end-to-end coverage on GCP infrastructure. You will work closely with product, engineering, and AI teams to embed quality from the ground up.


Responsibilities
  • Design, build, and maintain automated test suites using Playwright for web and API surfaces, including AI-generated content flows. 
  • Lead QA strategy for voice automation pipelines built on ElevenLabs — developing test cases for synthesis quality, latency, and failure modes. 
  • Validate Claude (Anthropic) integrations: prompt-response accuracy, edge case handling, safety behaviors, and output consistency across builds. 
  • Build and maintain Node.js-based test tooling, harnesses, and custom reporters for CI/CD pipelines. 
  • Deploy, monitor, and triage test infrastructure on Google Cloud Platform — leveraging Cloud Run, GCS, and Pub/Sub for scalable test execution. 
  • Define and track quality metrics: test coverage, flakiness rates, mean-time-to-detect, and regression velocity. 
  • Collaborate with engineers during design reviews to surface testability gaps and advocate for observable, fault-tolerant system design. 
  • Mentor junior QA engineers and establish team-wide standards for test authoring, review, and maintenance. 
Required Qualifications
  • 5+ years of QA engineering experience, with at least 2 years on systems that include LLMs, AI APIs, or speech/audio pipelines. 
  • Expert-level Playwright skills — authoring resilient selectors, managing parallel workers, and debugging flaky tests at scale. 
  • Proficient Node.js developer — comfortable writing custom test runners, CLI tooling, and service mocks in TypeScript/JavaScript. 
  • Hands-on GCP experience: deploying workloads to Cloud Run or GKE, querying logs in Cloud Logging, configuring artifact storage in GCS. 
  • Familiarity with ElevenLabs or comparable TTS/voice APIs — understanding synthesis parameters, webhook flows, and audio quality evaluation. 
  • Practical experience testing Claude or other LLMs — designing determinism-aware test strategies, evaluating prompt regressions, and building evals. 
  • Strong understanding of REST, WebSocket, and gRPC protocols for API-level testing. 
  • Experience integrating test suites into CI/CD pipelines (GitHub Actions, Cloud Build, or similar). 
     
Nice to Have
  • Experience writing custom LLM evals or using evaluation frameworks such as PromptFoo or Braintrust. 
  • Background in audio signal quality assessment or speech intelligibility testing. 
  • Familiarity with observability tooling: OpenTelemetry, Datadog, or GCP Cloud Monitoring. 
  • Knowledge of accessibility testing standards (WCAG 2.1) and assistive technology compatibility. 
Core Technology Stack:
  • Google Cloud Platform (GCP)
    Cloud Run, GCS, Pub/Sub, Cloud Logging, and GKE for scalable test infrastructure
  • ElevenLabs Voice Automation
    TTS pipeline testing, synthesis quality evaluation, webhook and latency validation
  • Node.js / TypeScript
    Custom test runners, service mocks, CLI tooling, and CI/CD integration
  • Playwright
    End-to-end and API-level browser automation with parallel execution
  • Claude (Anthropic)
    LLM integration QA, prompt regression testing, and output evaluation

Benefits

●      Competitive base salary commensurate with experience

●      High-autonomy environment with direct access to executive leadership

●      Structured operating cadence with clear goals, metrics, and career growth targets

●      Work that touches 19M+ patients — the mission is real
●      Flexible PTO policy


The Company
HQ: Irving, TX
26 Employees
Year Founded: 2021

What We Do

Steer Health helps healthcare organizations thrive by ensuring revenue generation, cost savings and exceptional patient experience. Our AI-powered growth and automation platform connects the marketing, growth, access, operational, clinical, and financial pathways to attract, guide, and retain patients. With Steer, hospitals and medical groups achieve higher patient volumes, cost savings, superior patient experiences, and happier staff and providers.
