Senior Data Engineer

Reposted 12 Hours Ago
85281, Tempe, AZ
In-Office
Senior level
Artificial Intelligence • Healthtech • Biotech
Where Molecular Science Meets Artificial Intelligence – Revolutionizing Cancer Care.
The Role
The Senior Data Engineer designs and maintains scalable data platforms on AWS, supports analytics and machine learning workflows, and collaborates with data scientists and researchers.
Summary Generated by Built In

At Caris, we understand that cancer is an ugly word—a word no one wants to hear, but one that connects us all. That’s why we’re not just transforming cancer care—we’re changing lives.

We introduced precision medicine to the world and built an industry around the idea that every patient deserves answers as unique as their DNA. Backed by cutting-edge molecular science and AI, we ask ourselves every day: “What would I do if this patient were my mom?” That question drives everything we do.

But our mission doesn’t stop with cancer. We're pushing the frontiers of medicine and leading a revolution in healthcare—driven by innovation, compassion, and purpose.

Join us in our mission to improve the human condition across multiple diseases. If you're passionate about meaningful work and want to be part of something bigger than yourself, Caris is where your impact begins.

Position Summary

The Senior Data Engineer will support our precision medicine and biomarker discovery initiatives. This role is responsible for designing, building, and maintaining scalable, cloud-native data platforms and pipelines that support analytics, machine learning, and computational biology workflows across structured and unstructured, multi-modal datasets. The ideal candidate brings strong software engineering and data architecture expertise, deep experience with AWS cloud services, and a collaborative mindset for partnering closely with data scientists, computational biologists, and R&D stakeholders.

Job Responsibilities

  • Design, build, and maintain scalable, reliable, and secure data pipelines for ingesting, transforming, storing, and serving large, multi-source and multi-omics datasets.

  • Architect and implement cloud-native data solutions on AWS to support analytics workflows, machine learning pipelines, and scientific research.

  • Develop and maintain automation frameworks for data ingestion, processing, validation, and delivery.

  • Build and deploy APIs, services, and data access layers to enable analytics and machine-learning solutions at scale.

  • Develop and deploy applications and workflows in cloud and/or HPC environments, adhering to industry best practices for system architecture, CI/CD, testing, and software design.

  • Partner closely with data scientists, computational biologists, and R&D scientists to design and evolve shared analytics platforms.

  • Optimize data systems for performance, cost efficiency, scalability, and reliability.

  • Ensure data quality, observability, and lineage across pipelines and platforms.

  • Adhere to coding, documentation, security, and compliance standards; manage technical deliverables for assigned projects.

  • Provide general informatics and platform support for laboratory research, technology development, and clinical studies.

  • Contribute to architectural decisions and mentor junior engineers as appropriate.

Required Qualifications

  • Ph.D. in Computer Science, Engineering, or a related technical field (or equivalent practical experience).

  • 5+ years of professional experience in data engineering, platform engineering, or backend software engineering roles.

  • Strong proficiency in Python and experience building production-grade data pipelines and services.

  • Extensive experience designing and operating data platforms on AWS, including EC2, S3, DynamoDB, EKS/ECS, Lambda, Glue, Athena, and related services.

  • Experience with Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or CDK.

  • Expertise in designing, implementing, and maintaining relational and non-relational databases (e.g., MySQL, PostgreSQL, MongoDB).

  • Extensive experience with containerization and orchestration technologies.

  • Strong proficiency with Linux and command-line–based workflows.

  • Familiarity with modern data platform concepts, including data lakes, lakehouses, streaming, and batch processing architectures.

  • Experience applying best practices in DevOps, DataOps, and/or MLOps, including CI/CD, monitoring, and automated testing.

  • Strong communication skills and the ability to collaborate effectively with multidisciplinary scientific and engineering teams.

  • Team-oriented mindset with a passion for building robust platforms that enable data-driven discovery and personalized medicine.

Preferred Qualifications

  • Familiarity with cancer biology concepts, including tumor genomics and molecular profiling workflows.

  • Experience supporting data pipelines for molecular diagnostics, biomarker discovery, or translational research.

  • Working knowledge of common molecular and clinical data types used in oncology research (e.g., NGS-derived data, variant annotations, expression matrices, clinical metadata).

  • Experience handling high-throughput sequencing–derived data and associated metadata at scale, including ingestion, normalization, and provenance tracking.

  • Understanding of bioinformatics data standards and formats (e.g., FASTQ, BAM/CRAM, VCF, GTF, or similar structured scientific data representations).

  • Familiarity with public cancer and genomics datasets (e.g., TCGA, COSMIC, cBioPortal, GEO, or equivalent resources).

  • Experience collaborating closely with computational biologists, bioinformaticians, and cancer researchers to translate research requirements into scalable data platform solutions.

  • Awareness of data quality, reproducibility, and traceability requirements in regulated or clinically adjacent oncology environments.

Physical Demands

  • Ability to sit, stand, and work at a computer for extended periods.

Training

  • All job-specific, safety, and compliance training is assigned based on the job functions associated with this employee.

Conditions of Employment: Individual must successfully complete the pre-employment process, which includes a criminal background check, drug screening, credit check (applicable for certain positions), and reference verification.

This job description reflects management’s assignment of essential functions. Nothing in this job description restricts management’s right to assign or reassign duties and responsibilities to this job at any time.

Caris Life Sciences is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, gender, gender identity, sexual orientation, age, status as a protected veteran, or status as a qualified individual with a disability, among other things.

Top Skills

Athena
AWS
CloudFormation
DynamoDB
EC2
ECS
EKS
Glue
Lambda
MongoDB
MySQL
Postgres
Python
S3
Terraform

The Company
HQ: Irving, TX
1,700 Employees
Year Founded: 2008

What We Do

Caris Life Sciences was founded in 2008 with a simple but powerful purpose – to help improve the lives of as many people as possible. With transformative technologies informed by massive amounts of big data, we are revolutionizing healthcare to provide physicians and patients with the highest quality information about their disease – from detecting it early and determining how best to treat it, to developing the next wave of novel therapies.
