We are looking for a Data Engineer to join a lean, high-output team. You will work directly alongside the VP of Analytics and CTO on production systems that run 24/7 in a HIPAA-regulated environment. This is a hands-on engineering role - you will own pipelines, write stored procedures, debug integrations, and ship improvements every week. You will also develop AI-powered automation as we continuously improve our data and operations workflows.
Primary Languages: Python (pandas, pyodbc, requests) and T-SQL (Azure SQL Server stored procedures, 300+ version-controlled scripts)
Data Warehouse: Azure SQL Server - primary production warehouse with 200+ tables
Cloud Infrastructure: Azure
ETL: Custom Python-based system
CRM: HubSpot - deep integration across contacts, companies, deals, tickets, referrals, and referring physician hierarchies
EHR: NextGen - raw feed ingestion for patients, appointments, pharmacy, and clinical transactions
Our Pillars
- Make things easier.
- Forge genuine connections.
- Elevate the standard.
Roles and Responsibilities
Own and extend the Python-based ETL job orchestration engine: add new job types, monitor execution, and resolve production failures
Build and maintain data pipelines powering operational reporting for scheduling, finance, credentialing, and clinical operations
Integrate data sources into the warehouse - EHR (NextGen), CRM (HubSpot), HR platforms (ADP, Lever), credentialing (Modio), call center (Five9), and other third-party APIs
Optimize high-frequency SQL workloads
Support and extend custom AI agents
Maintain HIPAA compliance across all data handling: enforce access controls, audit logging, and PHI segregation in pipelines and reporting layers
Write and maintain version-controlled code in the GitHub repository
Required Qualifications
3+ years of data engineering experience in a production environment
Strong SQL skills: complex stored procedures, CTEs, window functions, temp tables, index optimization, execution plan analysis
Python proficiency for ETL workloads: pandas, pyodbc or SQLAlchemy, REST API consumption, file handling, scheduling
Hands-on experience with data warehouse and ETL workloads
Experience building and maintaining scheduled data pipelines with robust error handling, retry logic, and logging
Ability to debug production failures independently, communicate status clearly, and resolve issues quickly under pressure
Comfort working in a HIPAA-regulated environment and handling PHI with appropriate care and controls
Familiarity with Azure Blob Storage or equivalent object storage for ETL staging workflows
Preferred Qualifications
HubSpot CRM data integration experience (Contacts, Deals, Tickets, Company hierarchy, API rate limit handling)
Healthcare data experience: EHR integrations (NextGen, Epic, or Cerner), credentialing systems, claims data, HIPAA BAA contexts
Experience building or maintaining AI/LLM-powered applications (Claude, OpenAI) in a production context
Familiarity with voice or telephony data pipelines, conversational AI systems, or patient intake automation
Experience with complex provider or resource scheduling systems and the data modeling they require
Exposure to CAC, LTV, or patient funnel analytics in a B2C healthcare or SaaS context
Strong communication skills - this team works directly with clinical ops, finance, and marketing stakeholders
Benefits
W2 role with competitive compensation
Medical, dental, and vision coverage starting on the first of the month after your start date
Paid Vacation, Sick, and Holiday time
Employee Assistance Program (EAP) offering confidential counseling, resources, and support for personal or professional challenges
401(k) plan with company contribution
Opportunity to work in a cutting-edge healthcare technology environment
Professional development opportunities and training
Collaborative and supportive work culture
Impactful role contributing to the enhancement of patient care and healthcare processes
What We Do
Headlight is a mental healthcare provider founded by psychiatrists that offers a full-circle approach to mental wellness, including therapy, medication management, and support groups. We operate through both virtual and in-person sessions, aiming to simplify access to care and improve patient outcomes by combining personalized clinical engagement with advanced technology. We are recognized as a trusted behavioral health partner in the Western U.S.








