Senior ETL Pipeline Engineer

Sorry, this job was removed at 02:12 p.m. (CST) on Monday, Aug 11, 2025
Papago, AZ, USA
In-Office
Artificial Intelligence • Healthtech • Biotech
Where Molecular Science Meets Artificial Intelligence – Revolutionizing Cancer Care.
The Role

At Caris, we understand that cancer is an ugly word—a word no one wants to hear, but one that connects us all. That’s why we’re not just transforming cancer care—we’re changing lives.

 

We introduced precision medicine to the world and built an industry around the idea that every patient deserves answers as unique as their DNA. Backed by cutting-edge molecular science and AI, we ask ourselves every day: “What would I do if this patient were my mom?” That question drives everything we do.

 

But our mission doesn’t stop with cancer. We're pushing the frontiers of medicine and leading a revolution in healthcare—driven by innovation, compassion, and purpose.

 

Join us in our mission to improve the human condition across multiple diseases. If you're passionate about meaningful work and want to be part of something bigger than yourself, Caris is where your impact begins.

Position Summary
The Senior ETL Pipeline Engineer is a senior-level role within Caris Life Sciences, responsible for productionalizing and maintaining existing pipelines, designing and implementing new ones, automating data quality checks, and enabling smooth handoffs to downstream teams. You will work with large, complex datasets and collaborate closely with upstream data providers and downstream users to ensure our pipelines meet scientific and operational needs. This role requires a self-starter who can independently drive projects from concept through deployment.
Job Responsibilities

  • Deliver high-quality, maintainable, and reproducible pipelines that meet production standards, including logging, error handling, and modular design, to support genomics research.

  • Productionalize existing ETL pipelines to enable consistent, repeatable execution.

  • Create new data pipelines using tools like AWS Step Functions and Metaflow to support evolving research and analysis goals.

  • Implement automated QC/QA steps to ensure data accuracy, completeness, and reproducibility.

  • Monitor pipeline performance and proactively address data anomalies or failures.

  • Work closely with upstream data providers (e.g., lab systems, sequencing platforms) to understand data formats and delivery schedules.

  • Partner with downstream consumers (e.g., data scientists, bioinformaticians, clinical teams) to ensure data usability and accessibility.

  • Integrate pipelines with AWS-based infrastructure, primarily writing outputs to S3.

  • Lead technical decision-making, advocate for best practices, and see projects through to completion.

  • Meet all assigned targets and goals set by management.

  • Perform other related duties as assigned.

  • Stay current with emerging technologies in data engineering, genomics, and cloud computing.

  • Take ownership of assigned projects through deployment and monitoring.

  • Communicate progress, risks, and blockers to ensure timely delivery of milestones.

Required Qualifications

  • Bachelor's degree from an accredited university or equivalent work experience in a related field.

  • 6+ years in data engineering, with a track record of building robust, production-grade pipelines and of owning and delivering complex ETL systems end-to-end, not just scripts or prototypes.

  • Proficient in Python for data processing and workflow development (e.g., using pandas and boto3).

  • Comfortable navigating and contributing to large, modular codebases that use object-oriented design principles to promote reuse and maintainability.

  • Strong software engineering fundamentals: version control, testing, code review, and documentation are second nature.

  • Solid SQL skills — able to read from and query SQL databases effectively.

  • Experience with AWS, particularly S3, Step Functions, and general cloud-based data workflows.

  • Experience working with large, complex datasets in a production environment.

  • Familiarity with workflow management tools such as Metaflow.

  • Demonstrated ability to work independently, prioritize tasks, and deliver results with minimal supervision.

  • Proven ability to collaborate with upstream data providers and downstream users to troubleshoot issues and ensure reliable data delivery.

Preferred Qualifications

  • Exposure to genomics, molecular biology, or biomedical datasets (e.g., VCF, gene expression matrices, variant annotation tables) is a strong plus.

  • Experience with Athena, Glue, or similar AWS data catalog/query tools.

  • Knowledge of data lineage, reproducibility, and scientific computing best practices.

  • Familiarity with infrastructure-as-code or containerization tools (e.g., Terraform, Docker, CDK).

Training

  • All job-specific, safety, and compliance training is assigned based on the job functions associated with this employee.

Conditions of Employment: Individual must successfully complete the pre-employment process, which includes a criminal background check, drug screening, credit check (applicable for certain positions), and reference verification.

This job description reflects management’s assignment of essential functions. Nothing in this job description restricts management’s right to assign or reassign duties and responsibilities to this job at any time.

 

Caris Life Sciences is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, gender, gender identity, sexual orientation, age, status as a protected veteran, among other things, or status as a qualified individual with disability.

The Company
HQ: Irving, TX
1,700 Employees
Year Founded: 2008

What We Do

Caris Life Sciences was founded in 2008 with a simple but powerful purpose – to help improve the lives of as many people as possible. With transformative technologies informed by massive amounts of big data, we are revolutionizing healthcare to provide physicians and patients with the highest quality information about their disease – from detecting it early and determining how best to treat it, to developing the next wave of novel therapies.
