At UnitedHealthcare, we're simplifying the health care experience, creating healthier communities and removing barriers to quality care. The work you do here impacts the lives of millions of people for the better. Come build the health care system of tomorrow, making it more responsive, affordable and optimized. Ready to make a difference? Join us to start Caring. Connecting. Growing together.
This role is responsible for designing, developing, and maintaining scalable and reliable data pipelines that support both batch and real-time analytics within an Azure-based data platform. The position operates as part of a collaborative data engineering team, working closely with fellow data engineers and data science & reporting partners to meet evolving data requirements. The scope of the role includes end-to-end pipeline development using Azure Data Factory, Databricks, PySpark, and streaming technologies; implementation of the Medallion (Bronze/Silver/Gold) architecture; enforcement of data quality, reliability, and performance standards; and adherence to enterprise data governance, security, and documentation practices across the data lifecycle.
You'll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.
Primary Responsibilities:
- Pipeline Development: Design, develop, and maintain robust pipelines to ingest data from various sources (both streaming and batch) into the analytics environment using Azure Data Factory and PySpark via Databricks. Set up real-time data ingestion using tools like Spark Structured Streaming and batch ETL jobs for periodic data loads. Ensure these pipelines are scalable, efficient, and fault-tolerant to handle growing data volumes and velocity
- Implement Data as per Medallion Architecture: Utilize the Medallion (Bronze/Silver/Gold) architecture principles to organize data processing stages. Establish raw data capture (bronze), perform cleansing and transformations (silver), and curate refined datasets for analysis and machine learning (gold). Apply best practices in each layer, such as schema enforcement and checkpointing for streaming data
- Optimize Spark jobs by tuning configurations, improving query logic, and managing resources to achieve high throughput and low latency. Address bottlenecks in streaming pipelines (e.g., by scaling clusters or tweaking batch intervals) and ensure timely data delivery. Optimize job scheduling and cluster utilization to balance timely data delivery with cost-effectiveness
- ETL Development & Maintenance: Build and maintain data pipelines with an emphasis on data cleaning steps. Integrate data from various sources (APIs, databases, file feeds, IoT streams, etc.) into the data platform, writing transformations that handle anomalies (e.g., missing or corrupt values) and standardize datasets. Collaborate with the other data engineer to share responsibility across different pipelines or sources, ensuring redundancy and knowledge transfer
- Data Quality Management: Implement comprehensive data validation rules and checks within pipelines. For example, verify schema correctness, check value ranges for sensor or health data, and ensure referential integrity where applicable. Set up automated alerts or logs that flag inconsistent or bad data, enabling quick intervention. Over time, build a library of data quality tests that run as part of the pipeline (for both streaming and batch processes) to catch issues early
- Emerging Pipeline Frameworks: Leverage modern pipeline frameworks and tools to improve development productivity. For example, use Databricks Delta Live Tables or Lakehouse pipelines to declaratively define data flows where applicable. Explore the use of Spark Declarative Lakeflow Pipelines or similar technologies to simplify the orchestration of complex data processes
- Reliability & Collaboration: Implement monitoring and alerting for pipeline health. Investigate and resolve problems such as data delays, pipeline failures, or data inconsistencies. Use logs, error messages, and analytics to identify root causes (e.g., source system changes, bug in transformation logic) and implement fixes. Work closely with the other data engineer and data science team members to understand data requirements and adjust pipelines accordingly. Document data engineering workflows and ensure proper data governance (security, privacy, access controls) is in place
- Documentation & Governance: Maintain clear documentation of data pipelines, including data source details, transformation logic, and data destination schemas. Ensure that data lineage is tracked so one can trace how data moved and changed through the system. Adhere to data governance policies - for instance, ensure sensitive data is properly masked or encrypted in non-production environments, and that access controls are in place. Work with leadership to periodically review and improve data management practices
You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.
Required Qualifications:
- 3+ years of experience in a data engineering role designing and implementing data pipelines and ETL processes. Should understand how to handle incremental data loads and maintain history (CDC - change data capture)
- 2+ years of experience in SQL for data manipulation and query optimization. Knowledge of Python and Apache Spark (using PySpark) for building data pipelines; ability to write efficient code for batch and streaming data transformations
- 1+ years of experience using Azure, Databricks or an equivalent cloud-based data platform. Comfortable with managing clusters, using notebooks, and working with Delta Lake or Parquet files. Familiarity with cloud data services and tools for pipeline orchestration is expected
- Experience working in a team environment with agile methodologies. Ability to communicate effectively with both technical peers and non-technical stakeholders (explaining data issues in plain language). Should be comfortable using version control systems and participating in collaborative development (code reviews, pair programming when needed)
- Familiar with streaming data technologies. This could include Spark Streaming, Kafka, Azure Event Hubs, or similar platforms for real-time data ingestion.
- Demonstrated ability to detect and correct data issues - for instance, identifying when a data source has stopped updating, or when an upstream change has altered data format. Experience implementing validation checks or using frameworks to enforce data quality standards
Preferred Qualifications:
- Experience with any declarative pipeline frameworks or data workflow management tools (e.g., Databricks Delta Live Tables). This can indicate readiness to adopt advanced tools in our environment
- Experience integrating data quality checks into pipelines (such as using assertions or Great Expectations tests) to ensure accuracy and completeness of data. Familiarity with data security practices, encryption, and handling of sensitive data
- Familiarity with streaming data handling (even in an assisting capacity, candidates should understand the basics of Spark Streaming or message queue systems)
- Demonstrated skill in performance tuning for Spark or SQL queries. For example, experience in partitioning strategies, caching, or troubleshooting shuffle issues to optimize heavy data workloads
*All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy.
Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. The salary for this role will range from $72,800 to $130,000 annually based on full-time employment. We comply with all minimum wage laws as applicable.
Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health that are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.
UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.
UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
Skills Required
- 3+ years in data engineering designing and implementing data pipelines and ETL processes, including incremental loads and CDC
- 2+ years using SQL for data manipulation and query optimization; knowledge of Python and Apache Spark (PySpark) for batch and streaming transformations
- 1+ years using Azure, Databricks, or equivalent cloud data platform; managing clusters, notebooks, and working with Delta Lake or Parquet files
- Experience working in a team using agile methodologies; effective communication with technical and non-technical stakeholders; use of version control and collaborative development practices
- Familiarity with streaming data technologies such as Spark Streaming, Kafka, or Azure Event Hubs for real-time ingestion
- Demonstrated ability to detect and correct data issues and implement validation checks or frameworks to enforce data quality standards
- Experience with declarative pipeline frameworks or data workflow management tools (e.g., Databricks Delta Live Tables)
- Experience integrating data quality checks into pipelines (e.g., Great Expectations) and familiarity with data security and encryption practices
- Familiarity with streaming data handling principles and message queue systems
- Demonstrated skill in performance tuning for Spark or SQL queries (partitioning, caching, shuffle troubleshooting)
Optum Compensation & Benefits Highlights
- Healthcare Strength: Health coverage offers copay and HSA medical options with dental, vision, company‑paid life and disability, and free or low‑cost virtual visits. Feedback suggests the offering is comprehensive and competitive on paper.
- Parental & Family Support: Time off and family supports include PTO, eight paid holidays plus a floating day, six weeks paid parental leave, up to two weeks paid caregiver leave, Bright Horizons back‑up care, and adoption assistance up to $10,000. Feedback suggests these resources are meaningful for caregivers and family needs.
- Retirement Support: Savings programs include a 401(k) with employer match (after one year, vesting after two) and a 10%‑discount Employee Stock Purchase Plan. These programs bolster long‑term financial security when combined with other savings resources.
Optum Insights
What We Do
Optum, part of the UnitedHealth Group family of businesses, is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together. At Optum, we support your well-being with an understanding team, extensive benefits and rewarding opportunities. By joining us, you’ll have the resources to drive system transformation while we help you take care of your future. We recognize the power of connection to drive change, improve efficiency and make a difference in health care. Join a team where your skills and ideas can make an impact and where collaboration is key to creating technology that produces healthier outcomes.
Optum Offices
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
Optum has three workplace models that balance the needs of the business and the responsibilities of each role. These models, core on‑site (5 days/week), hybrid (4 days/week) and telecommute or fully remote, vary by country, role and location.