AI Quality Engineer

Posted 6 Days Ago
Cincinnati, OH, USA
In-Office
90K-115K Annually
Mid level
Fintech • Software • Financial Services
The Role
As an AI Quality Engineer, you will oversee the testing and improvement of a document extraction system using LLM APIs, ensuring accurate data extraction from financial documents while collaborating on prompt engineering and evaluation frameworks.
Summary Generated by Built In

About Luma Financial Technologies

Founded in 2018, Luma Financial Technologies (“Luma”) has pioneered a cutting-edge fintech software platform that has been adopted by broker/dealer firms, RIA offices, and private banks around the world. By using Luma, institutional and retail investors have a fully customizable, independent, buy-side technology platform that helps financial teams more efficiently learn about, research, purchase, and manage alternative investments as well as annuities. Luma gives these users the ability to oversee the full, end-to-end process lifecycle by offering a suite of solutions. These include education resources and training materials; creation and pricing of custom structured products; electronic order entry; and post-trade management. By prioritizing transparency and ease of use, Luma is a multi-issuer, multi-wholesaler, and multi-product option that advisors can utilize to best meet their clients’ specific portfolio needs. Headquartered in Cincinnati, OH, Luma also has offices in New York, NY; Miami, FL; Zurich, Switzerland; and Lisbon, Portugal. For more information, please visit Luma’s website.

About the role

Luma Fintech is building a best-in-class LLM-powered document parsing pipeline that extracts structured data from complex financial product term sheets. We are seeking an AI Quality Engineer to own the daily testing, analysis, and iterative improvement of our Claude API-based extraction system. This role sits at the intersection of financial data operations and applied AI: you will be the person who closes the loop between what the model outputs and what the schema demands.

What you'll do

  • Run daily accuracy evaluations against a defined extraction schema, tracking field-level performance across structured product types (autocallables, CLNs, barrier notes, etc.)
  • Design and maintain test cases, regression suites, and gold-standard document sets to benchmark extraction quality over time
  • Diagnose extraction failures, distinguishing between prompt logic issues, schema ambiguity, model hallucinations, and edge-case document formats
  • Iterate on prompt engineering, system instructions, and context design to improve field-level extraction accuracy
  • Work alongside the AI Engineer lead to feed findings into validation logic and rules-based layers that sit on top of LLM output
  • Document failure modes with reproducible examples and root-cause hypotheses
  • Build and maintain evaluation metrics (precision, recall, field coverage, hallucination rate) and report on accuracy trends
  • Flag schema gaps or ambiguities surfaced by real document variance and collaborate with data operations to refine field definitions
  • Contribute to RAG improvements by identifying where retrieved context is insufficient or misleading
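
As an illustration of the evaluation metrics named above, here is a minimal Python sketch of field-level scoring against a gold-standard set. It is not Luma's actual tooling: the field names (`barrier_level`, `coupon`, `issuer`), the exact-match comparison, and the metric definitions (coverage as the share of gold values the model attempted, hallucination rate as the share of predicted values for which the gold document has none) are all assumptions chosen for the example.

```python
from collections import defaultdict


def field_metrics(gold_docs, pred_docs):
    """Per-field precision, recall, coverage, and hallucination rate.

    Each document is a dict mapping field name -> value, with None
    meaning "not present". Values are compared by exact equality,
    which is an assumption; real pipelines often normalize first.
    """
    counts = defaultdict(lambda: dict(tp=0, fp=0, fn=0, halluc=0, gold=0, covered=0))
    for gold, pred in zip(gold_docs, pred_docs):
        for field in set(gold) | set(pred):
            g, p = gold.get(field), pred.get(field)
            c = counts[field]
            if g is not None:
                c["gold"] += 1
                if p is not None:
                    c["covered"] += 1  # model attempted a value the gold set has
            if p is not None and p == g:
                c["tp"] += 1
            else:
                if p is not None:
                    c["fp"] += 1  # wrong or unsupported value
                    if g is None:
                        c["halluc"] += 1  # value produced with no gold counterpart
                if g is not None:
                    c["fn"] += 1  # gold value missed or extracted incorrectly

    def ratio(num, den):
        return num / den if den else None  # None when the metric is undefined

    return {
        field: {
            "precision": ratio(c["tp"], c["tp"] + c["fp"]),
            "recall": ratio(c["tp"], c["tp"] + c["fn"]),
            "coverage": ratio(c["covered"], c["gold"]),
            "hallucination_rate": ratio(c["halluc"], c["tp"] + c["fp"]),
        }
        for field, c in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical two-document gold set and model output.
    gold = [
        {"barrier_level": 0.7, "coupon": 0.08, "issuer": None},
        {"barrier_level": 0.6, "coupon": None, "issuer": "Bank A"},
    ]
    pred = [
        {"barrier_level": 0.7, "coupon": 0.09, "issuer": "Bank B"},
        {"barrier_level": None, "coupon": None, "issuer": "Bank A"},
    ]
    for name, scores in sorted(field_metrics(gold, pred).items()):
        print(name, scores)
```

Tracking these numbers per field and per product type, rather than as a single aggregate score, is what makes it possible to tell a prompt regression on one field from a general decline.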

Qualifications

Required

  • Hands-on experience working with LLM APIs (Anthropic, OpenAI, or similar) in a production or near-production context
  • Strong prompt engineering skills: you understand how instruction design affects model behavior, not just output tone
  • Analytical mindset with the ability to systematically isolate variables in model output quality
  • Experience designing structured test cases or evaluation frameworks (QA background is a plus)
  • Familiarity with JSON schema, structured data output, and data validation patterns
  • Ability to read and interpret complex financial or legal documents (term sheets, prospectuses, offering documents); prior financial services exposure is strongly preferred
  • Strong written communication; you’ll be documenting findings for both technical and non-technical stakeholders
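
To give a sense of the data validation patterns mentioned above, the following is a small, hedged Python sketch that checks raw model output against a hand-written field specification. It uses only the standard library rather than a JSON Schema validator, and the `SCHEMA` table, its field names, and the placeholder identifier in the usage lines are hypothetical, not the posting's actual extraction schema.

```python
import json

# Hypothetical field spec: name -> (accepted type or types, required?)
SCHEMA = {
    "isin": (str, True),
    "barrier_level": ((int, float), True),
    "maturity_date": (str, False),
}


def validate_extraction(raw_output):
    """Return a list of problems; an empty list means the output passed."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    for field, (expected, required) in SCHEMA.items():
        value = data.get(field)
        if value is None:
            if required:
                problems.append(f"missing required field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly;
        # no field in this spec expects a boolean.
        elif isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    # Keys the spec does not know about are reported rather than dropped.
    for field in sorted(data.keys() - SCHEMA.keys()):
        problems.append(f"unexpected field: {field}")
    return problems


if __name__ == "__main__":
    # "XS0000000000" is a made-up placeholder, not a real identifier.
    print(validate_extraction('{"isin": "XS0000000000", "barrier_level": 0.7}'))
    print(validate_extraction('{"barrier_level": "70%"}'))
```

Returning a list of problems instead of raising on the first one keeps every failure for a document in a single record, which suits the failure-mode documentation described in this posting.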

Preferred

  • Experience with RAG pipelines and retrieval evaluation
  • Python proficiency for scripting evaluation workflows or parsing outputs
  • Background in structured financial products (autocallables, structured notes, credit-linked notes)
  • Familiarity with evaluation frameworks or tools (e.g., LangSmith, RAGAS, custom evals)

What Success Looks Like

Within 90 days, you have established a repeatable daily evaluation process, documented a baseline of field-level accuracy across product types, and driven at least one measurable improvement in extraction quality through prompt iteration.

Why This Role

This is a high-ownership position on a strategic automation initiative with direct visibility to leadership. You won’t be maintaining someone else’s test suite; you’re building the quality layer of a system that processes real financial data at scale. The role will evolve as the system matures, with the opportunity to expand into evaluation infrastructure and model improvement strategy.

Skills Required

  • Hands-on experience working with LLM APIs (Anthropic, OpenAI, or similar) in a production context
  • Strong prompt engineering skills
  • Analytical mindset for isolating variables in model output quality
  • Experience designing structured test cases or evaluation frameworks
  • Familiarity with JSON schema and data validation patterns
  • Ability to read complex financial documents

The Company
HQ: Cincinnati, Ohio
102 Employees
Year Founded: 2018

What We Do

Luma Financial Technologies provides the leading, independent, multi-issuer platform for structured products and annuities. Luma is an award-winning platform that has been used by broker dealers and their advisors nationwide for nearly a decade to more efficiently source, configure, compare, and price structured products and annuities that meet their customers’ specific investment needs. Luma's advisor-centric design, renowned education and training capabilities, fully customizable deployment, and complete product lifecycle support help drive the further adoption and growth of the structured products and annuities market.
