Data Engineering Consultant - AI Platform

Bengaluru, Bengaluru Urban, Karnataka, IND
In-Office
Senior level
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
The Role
Design, build, and operate AI-ready data platforms and scalable pipelines (batch, streaming, real-time) to support model training, inference, RAG, semantic search, and enterprise AI. Implement lakehouse/lake/warehouse architectures, data governance, security controls, DataOps/CI-CD, observability, and feature/embedding pipelines. Collaborate with AI/ML teams to enable trusted, performant, and compliant data products.
Summary Generated by Built In
Requisition Number: 2369782
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
We're looking for a hands-on Senior Data Engineer - AI Platforms to build and scale AI-ready data platforms that power AI/ML, Generative AI, Agentic AI, analytics, and intelligent enterprise applications. This role focuses on engineering modern data platforms, data products, semantic foundations, and scalable data pipelines that enable AI systems to consume trusted, governed, and context-rich data.
The ideal candidate brings solid expertise in data engineering, distributed processing, modern cloud data platforms, and AI-centric data foundations. You will work closely with Data Architects, AI/ML Engineers, Applied Scientists, and Platform Engineers to deliver data platforms that support model training, inference, RAG, semantic retrieval, and enterprise AI applications.
Primary Responsibilities:
AI Data Platform Engineering
  • Build and enhance AI-ready data platforms supporting AI/ML, Generative AI, Agentic AI, analytics, and operational workloads
  • Develop scalable data pipelines spanning:
    • Data ingestion
    • Data transformation
    • Data processing
    • Data serving
    • Data consumption
  • Implement modern data architectures using:
    • Lakehouse
    • Data Lake
    • Data Warehouse
    • Medallion Architecture (Bronze, Silver, Gold)
  • Support data platforms that enable model training, inference, feature engineering, RAG, and enterprise AI applications

Data Engineering & Processing
  • Develop high-performance pipelines supporting structured, semi-structured, and unstructured data
  • Build batch, streaming, and real-time processing solutions using modern distributed data technologies
  • Implement scalable data processing frameworks utilizing:
    • Apache Spark
    • PySpark
    • Kafka
    • Cloud-native data services
  • Optimize data storage, partitioning, indexing, and query performance for scalability and cost efficiency
  • Implement resilient data processing patterns including checkpointing, retries, recovery mechanisms, and data validation

AI Data Foundations
  • Build and maintain AI-ready datasets, feature pipelines, and data products
  • Develop embedding generation pipelines and vectorized data preparation workflows
  • Support semantic search, retrieval, and RAG use cases through efficient data engineering practices
  • Enable AI data readiness through:
    • Data quality management
    • Feature engineering
    • Data enrichment
    • Metadata management
    • Semantic indexing
  • Contribute to building semantic data layers that provide business context and improve AI consumption of enterprise data

Data Governance & Security
  • Implement data governance standards covering:
    • Metadata management
    • Data lineage
    • Data quality
    • Data cataloging
    • Data stewardship
  • Support compliance with HIPAA, GDPR, PII protection, and enterprise governance standards
  • Implement secure data access controls using:
    • RBAC
    • Encryption
    • Data masking
    • Auditing
  • Ensure data platforms meet security, privacy, and regulatory requirements

Platform Reliability & Operational Excellence
  • Implement monitoring, logging, lineage tracking, alerting, and operational dashboards for data platforms
  • Support platform scalability, reliability, performance, and operational efficiency
  • Contribute to DataOps practices including:
    • CI/CD for data pipelines
    • Automated testing
    • Deployment automation
    • Data observability
  • Troubleshoot production issues and support continuous improvement initiatives

Collaboration & Innovation
  • Partner with AI/ML Engineers, Data Scientists, Applied Scientists, Architects, and Platform Teams to deliver AI-ready data solutions
  • Contribute to reusable engineering frameworks, shared services, and platform accelerators
  • Support adoption of emerging technologies across AI data platforms, semantic retrieval, and modern data ecosystems
  • Participate in architecture discussions and contribute to enterprise engineering standards
  • Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so

Required Qualifications:
  • Bachelor's degree in Computer Science, Engineering, Information Systems, Data Engineering, or related field
  • 8+ years of experience in Data Engineering, Data Platforms, Analytics Engineering, or related disciplines
  • Experience building and operating enterprise-scale data pipelines and data platforms
  • Experience implementing modern data architectures including Data Lakes, Lakehouse, Data Warehouses, and Medallion Architecture
  • Experience developing data pipelines supporting AI/ML and analytics workloads
  • Experience working with structured, semi-structured, and unstructured datasets
  • Experience with metadata management, data quality, lineage, and governance practices
  • Experience implementing CI/CD, automated testing, and DataOps practices
  • Solid experience with:
    • Databricks
    • Snowflake
    • Apache Spark
    • PySpark
    • SQL
    • Python
  • Solid understanding of distributed processing, scalability, fault tolerance, and performance optimization
  • Understanding of security, privacy, and compliance requirements for enterprise data platforms
  • Familiarity with feature engineering, embeddings, semantic search, vectorized data, and AI-ready data foundations
  • Proven analytical, communication, problem-solving, and collaboration skills

Preferred Qualifications:
  • Hands-on experience with Databricks Lakehouse Platform, Snowflake Data Cloud, Delta Lake, Apache Iceberg, and cloud-native data platforms
  • Experience building AI-ready data platforms that support AI/ML, Generative AI, Agentic AI, and Retrieval-Augmented Generation (RAG) workloads
  • Experience developing feature stores, embedding pipelines, semantic indexing solutions, and AI data products
  • Experience with vector databases, semantic retrieval platforms, and enterprise search solutions
  • Experience implementing batch, streaming, and event-driven architectures using Kafka and related technologies
  • Experience working with cloud platforms including Azure, AWS, or GCP
  • Experience contributing to reusable data frameworks, platform accelerators, and shared engineering services
  • Experience within healthcare, financial services, insurance, banking, or other regulated industries
  • Experience mentoring junior engineers and contributing to engineering best practices
  • Solid understanding of DataOps, data observability, automated data quality, and platform engineering practices
  • Familiarity with semantic layers, knowledge graphs, ontology-driven models, and context-aware data architectures

Technical Stack
  • Data Platforms: Databricks, BigQuery, Snowflake, Azure Synapse, Delta Lake
  • Processing: Apache Spark, PySpark, Spark SQL
  • Streaming: Kafka, Spark Streaming, Event Hub, Kinesis
  • Storage: S3, ADLS, GCS, Parquet, ORC
  • Databases: PostgreSQL, MySQL, SQL Server, Cosmos DB, NoSQL
  • AI Data Layer: Pinecone, ChromaDB, FAISS, embeddings, semantic search, RAG pipelines
  • Orchestration: Airflow, Azure Data Factory, dbt
  • Programming: Python, SQL
  • Cloud: AWS, Azure, GCP
  • DevOps: Docker, Kubernetes, CI/CD (Jenkins, GitHub Actions)
  • Observability & Governance: Monitoring, logging, lineage, data catalogs
  • Security & Compliance: RBAC, IAM, encryption, masking
  • Integration: REST APIs, microservices

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission.

Skills Required

  • Bachelor's degree in Computer Science, Engineering, Information Systems, Data Engineering, or related field
  • 8+ years of experience in Data Engineering, Data Platforms, Analytics Engineering, or related disciplines
  • Experience building and operating enterprise-scale data pipelines and data platforms
  • Experience implementing modern data architectures (Data Lakes, Lakehouse, Data Warehouses, Medallion Architecture)
  • Experience developing data pipelines supporting AI/ML and analytics workloads
  • Experience working with structured, semi-structured, and unstructured datasets
  • Experience with metadata management, data quality, lineage, and governance practices
  • Experience implementing CI/CD, automated testing, and DataOps practices
  • Solid experience with Databricks, Snowflake, Apache Spark, PySpark, SQL, and Python
  • Solid understanding of distributed processing, scalability, fault tolerance, and performance optimization
  • Understanding of security, privacy, and compliance requirements (HIPAA, GDPR, PII protection) for enterprise data platforms
  • Familiarity with feature engineering, embeddings, semantic search, vectorized data, and AI-ready data foundations
  • Proven analytical, communication, problem-solving, and collaboration skills
  • Hands-on experience with Databricks Lakehouse Platform, Snowflake Data Cloud, Delta Lake, Apache Iceberg, and cloud-native data platforms
  • Experience building feature stores, embedding pipelines, semantic indexing, and AI data products
  • Experience with vector databases and semantic retrieval platforms (Pinecone, ChromaDB, FAISS)
  • Experience implementing event-driven architectures and streaming using Kafka and related technologies
  • Experience with cloud platforms including Azure, AWS, or GCP
  • Experience mentoring junior engineers and contributing to engineering best practices
  • Experience within regulated industries (healthcare, financial services, insurance, banking)

The Company
HQ: Eden Prairie, MN
160,000 Employees
Year Founded: 2011

What We Do

Optum, part of the UnitedHealth Group family of businesses, is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together. At Optum, we support your well-being with an understanding team, extensive benefits and rewarding opportunities. By joining us, you’ll have the resources to drive system transformation while we help you take care of your future. We recognize the power of connection to drive change, improve efficiency and make a difference in health care. Join a team where your skills and ideas can make an impact and where collaboration is key to creating technology that produces healthier outcomes.

Optum Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Optum has three workplace models that balance the needs of the business and the responsibilities of each role. These models, core on‑site (5 days/week), hybrid (4 days/week) and telecommute or fully remote, vary by country, role and location.

Typical time on-site: Not Specified
HQ: Eden Prairie, MN
Metro Manila, Philippines
Cebu, Philippines
Davao, Philippines
Ann Arbor, MI
Atlanta, GA
Baltimore, MD
Bengaluru, India
Chennai, India
Dallas, TX
Detroit, MI
Dublin, Ireland
Hartford, CT
Houston, TX
Hyderabad, India
Jacksonville, FL
Las Vegas, NV
Letterkenny, Ireland
Louisville, KY
Madison, WI
Minneapolis, MN
Nashville, TN
New Delhi, India
Philadelphia, PA
Phoenix, AZ
Pune, India
Raleigh, NC
San Diego, CA
Washington, DC
