QA Engineer

Reposted 7 Days Ago
Country States, Pájaros Barrio, Bayamón
In-Office
Mid level
Artificial Intelligence • Information Technology
The Role
As a QA Manager, you will develop scalable QA plans for evaluating AI agents, mentor a QA team, and collaborate with engineering to enhance AI models.
Summary Generated by Built In
Scaled Cognition is the world’s only model lab dedicated exclusively to customer experience and pioneering agentic models purpose-built for reliable action-taking enterprise applications. Backed by Khosla Ventures, the company’s flagship Agentic Pretrained Transformer (APT) eliminates hallucinations, enforces enterprise policies and increases reliability in real-world CX workflows. Founded by serial AI entrepreneurs, former Microsoft Corporate Vice President of Conversational AI Dan Roth, and UC Berkeley AI Professor Dan Klein, and built by a team of world-class PhD researchers and engineers, Scaled Cognition advances the science of agentic AI to deliver safe, policy-aligned automation that enterprises can trust.
 

As a QA Manager at Scaled Cognition, you will:

  • Develop and implement scalable QA plans for evaluating AI agents, defining key performance metrics to measure progress over time.
  • Collaborate with product and engineering teams to document findings, test fixes, and recommend improvements to the underlying models and conversational flows.
  • Lead and mentor a team of QA engineers, establishing best practices and processes for testing conversational AI agents.

Example projects could include:

  • Building test sets for regression tracking, agent robustness, and end-to-end testing.
  • Reviewing and analyzing voice and chat transcripts to quickly identify conversational gaps and provide data for faster iteration on customer deployments.
  • Designing and automating testing pipelines to scale QA capacity across a diverse portfolio of customers and to continuously evaluate the performance of our AI agents.

Preferred Qualifications: 

  • Intermediate-level proficiency in Python and experience building and testing conversational AI/LLM systems.
  • Background in implementing evaluation benchmarks and production monitoring metrics.
  • Experience working with libraries and tooling common in the AI/LLM ecosystem.
  • Demonstrated precision in documenting test plans, test cases, and bug reports, ensuring data is accurate and easily understandable by cross-functional teams.
  • Experience leveraging AI-powered assistants and tooling to enable rapid iteration, prototyping, and accelerated delivery.

Top Skills

Python

The Company
21 Employees
Year Founded: 2023

What We Do

Scaled Cognition is developing a new generation of rational, controllable AI models deployable as domain experts for grounded, real-world applications.
