Stratus Medical

Data and AI Quality Automation Engineer

Posted 22 Days Ago

75039, Irving, TX, USA

In-Office

Senior level

Healthtech

The Role

Design and build automated data validation and monitoring systems across clinical and AI pipelines. Translate business needs into validation frameworks, implement Python/C#/SQL checks, perform root-cause analysis, and embed data quality and compliance (HIPAA/SOC 2) into data lifecycles. Collaborate with stakeholders to prioritize, deploy, and document automation and AI-assisted tooling.

Job Overview:

The Data and AI Quality Automation Engineer designs and builds automated systems to ensure the accuracy, completeness, and reliability of data across Stratus’s clinical, operational, and AI-driven platforms. This role is central to delivering trusted data for analytics and decision-making within a HIPAA-regulated healthcare environment.

This role combines data engineering, quality assurance, and automation, with a focus on replacing manual checks with scalable systems, real-time monitoring, and quality controls built into data pipelines. The engineer partners with all departments, including IT, clinical operations, business functions, and data engineering, to proactively detect issues, address root causes, and ensure data quality is embedded at every stage of the data lifecycle.

This position also supports data governance and compliance by aligning data quality practices with HIPAA and SOC 2 requirements, ensuring solutions are secure, auditable, and compliant by design.

Key Responsibilities:

Data & AI Opportunity Discovery and Execution

Conduct structured listening tours across all departments (clinical, operations, finance, IT, etc.) to identify data quality gaps, manual workflows, and AI automation opportunities.
Map end-to-end data flows, dependencies, and failure points across systems (migration, microservices, BI, AI/ML pipelines).
Perform gap analysis and impact assessment, prioritizing initiatives based on risk, operational impact, and scalability.
Translate business and clinical needs into clear technical requirements, validation strategies, and automation roadmaps.
Own the full lifecycle from discovery → design → execution → monitoring, ensuring solutions deliver measurable outcomes.
Partner with stakeholders to align priorities, success metrics, and adoption of automated and AI-driven solutions.

Automated Validation System Development

Design and implement automated data validation frameworks that scale across migration, microservice, BI, and AI/ML project types.
Develop AI-powered quality checks that learn from data patterns and surface anomalies before they reach clinical or operational systems.
Build programmatic tests and monitoring pipelines that replace manual validation workflows end-to-end.
Write Python and SQL scripts that validate complex data relationships, referential integrity, and business rules automatically.
Maintain and extend validation libraries so that new projects inherit proven quality checks from day one.

Manual Validation & Root Cause Analysis

Investigate complex data discrepancies surfaced by automated systems — dig into root cause, not just symptoms.
Perform targeted manual validation when building new automation or validating critical system migrations.
Partner with engineering and clinical teams to resolve systemic data quality issues and prevent recurrence.
Validate data accuracy and completeness during high-stakes migrations and platform changes.

AI-Assisted & Autonomous Development

Leverage agentic AI development tools (e.g., Claude, Cursor) throughout the development lifecycle — not as a novelty, but as a core productivity and quality practice.
Apply prompt engineering techniques to accelerate validation script development, anomaly analysis, and documentation.
Stay current on AI tooling advances and proactively propose where new tools can improve data quality outcomes.

Collaboration & Continuous Improvement

Partner with all departments to align data requirements and ensure quality standards are proactively embedded upstream within systems and workflows.
Recommend and implement enhancements to data pipelines, validation processes, and quality monitoring dashboards.
Document data quality standards, validation patterns, and automation runbooks for team-wide use.
Contribute to Stratus's data governance practices, including alignment with HIPAA data integrity requirements.

Learning & Development

Continuously develop expertise in data engineering, AI tooling, and healthcare data standards.
Stay current on emerging validation frameworks, data quality tools, and automation best practices.

Qualifications

Education & Experience

Bachelor’s degree in Computer Science, Information Systems, Data Engineering, or a related field.
Minimum of five (5) years of experience in software development, data engineering, QA automation, or a closely related technical role.
Demonstrated experience building automated testing or data validation systems — not just executing test cases.
Prior experience working with healthcare, clinical, or other regulated data environments preferred.

Required Qualifications

5+ years of hands-on experience building automated data validation, QA automation, or data engineering pipelines.
Strong proficiency in C# and Python, with the ability to write production-quality validation scripts, not just ad-hoc automation.
Strong SQL skills — able to write complex queries validating referential integrity, data relationships, and business logic across relational databases (MSSQL, MySQL, or equivalent).
Solid understanding of:
- Data structures, schemas, and dependency relationships across multi-system environments
- Data pipeline architecture and where quality controls must be embedded
- Root cause analysis methodologies for complex data discrepancies
Hands-on experience with AI-assisted development tools (e.g., Claude, Cursor, or equivalent agentic development frameworks) used meaningfully in a professional workflow, not just experimentally.
Automation-first mindset — the instinct is always to build a system, not execute a manual check.
Clear written and verbal communication skills, including the ability to document technical standards for cross-functional audiences.
Ability to work independently, manage priorities without direct oversight, and communicate proactively with distributed teams.

(Equivalent combination of education and directly demonstrated experience will be considered.)

Preferred / Nice-to-Have Skills:

Familiarity with data quality frameworks such as Great Expectations or dbt Tests.
Experience with cloud data platforms: Databricks, Snowflake, AWS, Azure, or GCP.
Experience with real-time data streaming (Kafka, Event Hub).
Knowledge of healthcare data standards: HL7, FHIR, or medical device data formats.
Experience with front-end or API testing tools (Puppeteer, Playwright, Postman).
Familiarity with JavaScript for web application data validation.
Exposure to AI/ML pipeline data quality practices — training data validation, model output monitoring.
Experience in a SOC 2–certified or HIPAA-regulated technology environment.

Soft Skills:

Insatiable curiosity — you ask, "why does this data look this way?" and dig until you understand.
Solution-oriented: you prototype and iterate rather than cataloguing reasons something can't be done.
Strong analytical and problem-solving skills with a high tolerance for data ambiguity.
Collaborative mindset — able to work across IT, clinical operations, data engineering, and business units.
Detail-oriented with a proactive approach to surfacing data quality issues before they become incidents.

Physical Requirements:

Ability to sit for extended periods of time.
Repetitive movement of fingers and hands.
Talking and hearing.
Reaching with hands and arms.
Clarity of vision at 20 feet or less.

Mental Requirements:

Read, evaluate and interpret data.
Perform data entry and mathematical operations.

Work Environment:

Standard office environment

Hazards:

None

Nothing in this job description restricts management’s right to assign or reassign duties and responsibilities to this job at any time.

This job description is subject to change at any time.

Skills Required

Bachelor's degree in computer science, information systems, data engineering, or related field (or equivalent experience).
Minimum of 5 years of experience in software development, data engineering, QA automation, or closely related technical roles.
5+ years of hands-on experience building automated data validation, QA automation, or data engineering pipelines.
Strong proficiency in C# for production-quality validation scripts.
Strong proficiency in Python for production-quality validation scripts and automation.
Strong SQL skills, including the ability to write complex queries validating referential integrity and business logic across relational databases.
Experience with relational databases such as MSSQL or MySQL (or equivalent).
Solid understanding of data structures, schemas, pipeline architecture, and dependency relationships.
Proven root cause analysis methodologies for complex data discrepancies.
Hands-on experience using AI-assisted development tools (e.g., Claude, Cursor) in professional workflows.
Automation-first mindset and ability to design scalable validation systems rather than manual checks.
Clear written and verbal communication skills and ability to document technical standards for cross-functional audiences.
Ability to work independently, manage priorities without direct oversight, and communicate with distributed teams.
Prior experience working with healthcare, clinical, or other regulated data environments (preferred).
Familiarity with data quality frameworks (e.g., Great Expectations, dbt Tests) (preferred).
Experience with cloud data platforms (Databricks, Snowflake, AWS, Azure, GCP) (preferred).
Experience with real-time streaming technologies (Kafka, Event Hub) (preferred).
Knowledge of healthcare data standards (HL7, FHIR) (preferred).
Experience with front-end or API testing tools (Puppeteer, Playwright, Postman) and JavaScript for validation (preferred).
Experience in SOC 2-certified or HIPAA-regulated technology environments (preferred).