Data Engineering Consultant - AI Platform

Optum

Posted 56 Minutes Ago

Bengaluru, Bengaluru Urban, Karnataka, IND

In-Office

Senior level

Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics

The Role

Design, build, and operate AI-ready data platforms and scalable pipelines (batch, streaming, real-time) to support model training, inference, RAG, semantic search, and enterprise AI. Implement lakehouse/lake/warehouse architectures, data governance, security controls, DataOps/CI-CD, observability, and feature/embedding pipelines. Collaborate with AI/ML teams to enable trusted, performant, and compliant data products.

Summary Generated by Built In

Requisition Number: 2369782
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
We're looking for a hands-on Senior Data Engineer - AI Platforms to build and scale AI-ready data platforms that power AI/ML, Generative AI, Agentic AI, analytics, and intelligent enterprise applications. This role focuses on engineering modern data platforms, data products, semantic foundations, and scalable data pipelines that enable AI systems to consume trusted, governed, and context-rich data.
The ideal candidate brings solid expertise in data engineering, distributed processing, modern cloud data platforms, and AI-centric data foundations. You will work closely with Data Architects, AI/ML Engineers, Applied Scientists, and Platform Engineers to deliver data platforms that support model training, inference, RAG, semantic retrieval, and enterprise AI applications.
Primary Responsibilities:
AI Data Platform Engineering

Build and enhance AI-ready data platforms supporting AI/ML, Generative AI, Agentic AI, analytics, and operational workloads
Develop scalable data pipelines spanning:
- Data ingestion
- Data transformation
- Data processing
- Data serving
- Data consumption
Implement modern data architectures using:
- Lakehouse
- Data Lake
- Data Warehouse
- Medallion Architecture (Bronze, Silver, Gold)
Support data platforms that enable model training, inference, feature engineering, RAG, and enterprise AI applications

Data Engineering & Processing

Develop high-performance pipelines supporting structured, semi-structured, and unstructured data
Build batch, streaming, and real-time processing solutions using modern distributed data technologies
Implement scalable data processing frameworks utilizing:
- Apache Spark
- PySpark
- Kafka
- Cloud-native data services
Optimize data storage, partitioning, indexing, and query performance for scalability and cost efficiency
Implement resilient data processing patterns including checkpointing, retries, recovery mechanisms, and data validation

AI Data Foundations

Build and maintain AI-ready datasets, feature pipelines, and data products
Develop embedding generation pipelines and vectorized data preparation workflows
Support semantic search, retrieval, and RAG use cases through efficient data engineering practices
Enable AI data readiness through:
- Data quality management
- Feature engineering
- Data enrichment
- Metadata management
- Semantic indexing
Contribute to building semantic data layers that provide business context and improve AI consumption of enterprise data

Data Governance & Security

Implement data governance standards covering:
- Metadata management
- Data lineage
- Data quality
- Data cataloging
- Data stewardship
Support compliance with HIPAA, GDPR, PII protection, and enterprise governance standards
Implement secure data access controls using:
- RBAC
- Encryption
- Data masking
- Auditing
Ensure data platforms meet security, privacy, and regulatory requirements

Platform Reliability & Operational Excellence

Implement monitoring, logging, lineage tracking, alerting, and operational dashboards for data platforms
Support platform scalability, reliability, performance, and operational efficiency
Contribute to DataOps practices including:
- CI/CD for data pipelines
- Automated testing
- Deployment automation
- Data observability
Troubleshoot production issues and support continuous improvement initiatives

Collaboration & Innovation

Partner with AI/ML Engineers, Data Scientists, Applied Scientists, Architects, and Platform Teams to deliver AI-ready data solutions
Contribute to reusable engineering frameworks, shared services, and platform accelerators
Support adoption of emerging technologies across AI data platforms, semantic retrieval, and modern data ecosystems
Participate in architecture discussions and contribute to enterprise engineering standards
Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so.

Required Qualifications:

Bachelor's degree in Computer Science, Engineering, Information Systems, Data Engineering, or related field
8+ years of experience in Data Engineering, Data Platforms, Analytics Engineering, or related disciplines
Experience building and operating enterprise-scale data pipelines and data platforms
Experience implementing modern data architectures including Data Lakes, Lakehouse, Data Warehouses, and Medallion Architecture
Experience developing data pipelines supporting AI/ML and analytics workloads
Experience working with structured, semi-structured, and unstructured datasets
Experience with metadata management, data quality, lineage, and governance practices
Experience implementing CI/CD, automated testing, and DataOps practices
Solid experience with:
- Databricks
- Snowflake
- Apache Spark
- PySpark
- SQL
- Python
Solid understanding of distributed processing, scalability, fault tolerance, and performance optimization
Understanding of security, privacy, and compliance requirements for enterprise data platforms
Familiarity with feature engineering, embeddings, semantic search, vectorized data, and AI-ready data foundations
Proven analytical, communication, problem-solving, and collaboration skills

Preferred Qualifications:

Hands-on experience with Databricks Lakehouse Platform, Snowflake Data Cloud, Delta Lake, Apache Iceberg, and cloud-native data platforms
Experience building AI-ready data platforms that support AI/ML, Generative AI, Agentic AI, and Retrieval-Augmented Generation (RAG) workloads
Experience developing feature stores, embedding pipelines, semantic indexing solutions, and AI data products
Experience with vector databases, semantic retrieval platforms, and enterprise search solutions
Experience implementing batch, streaming, and event-driven architectures using Kafka and related technologies
Experience working with cloud platforms including Azure, AWS, or GCP
Experience contributing to reusable data frameworks, platform accelerators, and shared engineering services
Experience within healthcare, financial services, insurance, banking, or other regulated industries
Experience mentoring junior engineers and contributing to engineering best practices
Solid understanding of DataOps, data observability, automated data quality, and platform engineering practices
Familiarity with semantic layers, knowledge graphs, ontology-driven models, and context-aware data architectures

Technical Stack

Data Platforms: Databricks, BigQuery, Snowflake, Azure Synapse, Delta Lake
Processing: Apache Spark, PySpark, Spark SQL
Streaming: Kafka, Spark Streaming, Event Hub, Kinesis
Storage: S3, ADLS, GCS, Parquet, ORC
Databases: PostgreSQL, MySQL, SQL Server, Cosmos DB, NoSQL
AI Data Layer: Pinecone, ChromaDB, FAISS, embeddings, semantic search, RAG pipelines
Orchestration: Airflow, Azure Data Factory, dbt
Programming: Python, SQL
Cloud: AWS, Azure, GCP
DevOps: Docker, Kubernetes, CI/CD (Jenkins, GitHub Actions)
Observability & Governance: Monitoring, logging, lineage, data catalogs
Security & Compliance: RBAC, IAM, encryption, masking
Integration: REST APIs, microservices

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.

Skills Required

Bachelor's degree in Computer Science, Engineering, Information Systems, Data Engineering, or related field
8+ years of experience in Data Engineering, Data Platforms, Analytics Engineering, or related disciplines
Experience building and operating enterprise-scale data pipelines and data platforms
Experience implementing modern data architectures (Data Lakes, Lakehouse, Data Warehouses, Medallion Architecture)
Experience developing data pipelines supporting AI/ML and analytics workloads
Experience working with structured, semi-structured, and unstructured datasets
Experience with metadata management, data quality, lineage, and governance practices
Experience implementing CI/CD, automated testing, and DataOps practices
Solid experience with Databricks, Snowflake, Apache Spark, PySpark, SQL, and Python
Solid understanding of distributed processing, scalability, fault tolerance, and performance optimization
Understanding of security, privacy, and compliance requirements (HIPAA, GDPR, PII protection) for enterprise data platforms
Familiarity with feature engineering, embeddings, semantic search, vectorized data, and AI-ready data foundations
Proven analytical, communication, problem-solving, and collaboration skills
Hands-on experience with Databricks Lakehouse Platform, Snowflake Data Cloud, Delta Lake, Apache Iceberg, and cloud-native data platforms
Experience building feature stores, embedding pipelines, semantic indexing, and AI data products
Experience with vector databases and semantic retrieval platforms (Pinecone, ChromaDB, FAISS)
Experience implementing event-driven architectures and streaming using Kafka and related technologies
Experience with cloud platforms including Azure, AWS, or GCP
Experience mentoring junior engineers and contributing to engineering best practices
Experience within regulated industries (healthcare, financial services, insurance, banking)

The Company

HQ: Eden Prairie, MN

160,000 Employees

Year Founded: 2011

What We Do

Optum, part of the UnitedHealth Group family of businesses, is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together. At Optum, we support your well-being with an understanding team, extensive benefits and rewarding opportunities. By joining us, you’ll have the resources to drive system transformation while we help you take care of your future. We recognize the power of connection to drive change, improve efficiency and make a difference in health care. Join a team where your skills and ideas can make an impact and where collaboration is key to creating technology that produces healthier outcomes.

Optum Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Optum has three workplace models that balance the needs of the business and the responsibilities of each role. These models, core on‑site (5 days/week), hybrid (4 days/week) and telecommute or fully remote, vary by country, role and location.

Typical time on-site: Not Specified

HQ: Eden Prairie, MN

Metro Manila, Philippines

Cebu, Philippines

Davao, Philippines

Ann Arbor, MI

Atlanta, GA

Baltimore, MD

Bengaluru, India

Chennai, India

Dallas, TX

Detroit, MI