Lead Site Reliability Engineer

Posted Yesterday
Gurugram, Haryana
In-Office
Senior level
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
The Role
Lead initiatives to improve operational efficiency in cloud software solutions by integrating monitoring tools, enhancing automation, and ensuring system reliability.
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
What makes this a one-of-a-kind opportunity? We have more than 12,000 technology colleagues serving the IT needs of our clients across the globe and our own Fortune 6 IT needs. At Optum, you'll be encouraged to combine your passion and technical expertise to help us shape the health care system for years to come. You'll help change the way our businesses and consumers engage with technology across a wide platform of health services and delivery systems by setting team goals, forecasting resource needs, and guiding solutions developed to solve business and operational challenges. If you're out to make a difference, apply today.
Medicare & Retirement (M&R) | Community and State | Individual and Family Plan - Technology Operations needs an experienced Senior Site Reliability Engineer (SRE) to act as a bridge between software engineering and IT operations. The primary goal of this role is to maintain software applications and infrastructure that are reliable, scalable, and resilient, and to improve performance and operational efficiency. The role also ensures that all business-critical products have the right tools implemented and exercises executed to validate system availability, latency, performance, efficiency, monitoring, incident priority, and capacity planning. This role will enable Government Programs (M&R, C&S and IFP) Technology Operations to meet our business segment's needs as an IT partner and advocate.
Primary Responsibilities:
  • Define and set up industry-best alerting and monitoring practices across the line of business, and design/architect efficient monitoring dashboards on Splunk/Dynatrace/Grafana common to all applications/products across the line of business
  • Participate in the 5-9 program and other peak-season readiness initiatives, and collaborate with application teams to evaluate applications from a resiliency, availability, and reliability perspective
  • Act as a gatekeeper for changes rolling into production
  • Embrace continuous learning of engineering practices to ensure industry best practices and technology adoption, including DevOps, Cloud and Agile thinking
  • Drive tech debt reduction and tech transformation, including open source/inner source adoption, Cloud adoption, and HCP assessment and adoption
  • Improve processes/runbooks and lead efforts to automate manual support tasks, cutting down toil
  • Participate in on-call rotation
  • Improve operational tooling and frameworks, and perform chaos engineering activities
  • Respond to platform emergencies, alerts, and escalations from Customer Support
  • Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so

Required Qualifications:
  • Undergraduate degree or equivalent experience
  • 10+ years of experience in the IT industry across the entire SDLC
  • 5+ years of experience in integrating monitoring and alerting into cloud software solutions
  • 3+ years of coding experience with one or more of the following languages: Java, C#, C/C++, Go, Python, Perl, PowerShell or JavaScript, with a willingness and ability to learn new ones
  • 3+ years of experience with Splunk / Dynatrace / Datadog / Grafana / Telemetry or similar monitoring tools
  • 2+ years of experience building and programmatically consuming REST APIs
  • ServiceNow experience
  • Work experience as a Site Reliability Engineer or similar role
  • Experience with any database
  • Experience in operations support for any application
  • Experience interacting programmatically with a relational database (SQL Server/MySQL/PostgreSQL)
  • Experience planning and supporting 99.999% availability for critical applications in production
  • Knowledge of any scripting or programming language
  • Technical writing skills (creating flow diagrams, end user documentation, etc.)
  • Solid understanding of engineering fundamentals: unit testing, performance testing, code reviews, telemetry, agile and DevOps
  • Solid understanding of: continuous integration / continuous delivery tools, serverless architecture, containerization, public / private cloud, application observability and/or messaging / stream architecture
  • Proven ability to communicate effectively to both technical and non-technical, globally distributed audiences

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission.

Top Skills

C#
C/C++
Datadog
Dynatrace
Go
Grafana
Java
JavaScript
MySQL
Perl
Postgres
PowerShell
Python
REST APIs
ServiceNow
Splunk
SQL Server

The Company
HQ: Eden Prairie, MN
160,000 Employees
Year Founded: 2011

What We Do

Optum, part of the UnitedHealth Group family of businesses, is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.

At Optum, we support your well-being with an understanding team, extensive benefits and rewarding opportunities. By joining us, you’ll have the resources to drive system transformation while we help you take care of your future.

We recognize the power of connection to drive change, improve efficiency and make a difference in health care. Join a team where your skills and ideas can make an impact and where collaboration is key to creating technology that produces healthier outcomes.

Optum Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Optum has three workplace models that balance the needs of the business, the responsibilities of a job, and your preference for more flexibility. These models are core on-site 5 days/week, hybrid 3 days/week, and telecommute or fully remote.

Typical time on-site: Flexible
HQ: Eden Prairie, MN
Ann Arbor, MI
Atlanta, GA
Baltimore, MD
Belfast, GB
Dallas, TX
Detroit, MI
Hartford, CT
Houston, TX
Jacksonville, FL
Las Vegas, NV
Louisville, KY
Madison, WI
Minneapolis, MN
Nashville, TN
Philadelphia, PA
Phoenix, AZ
Raleigh, NC
San Diego, CA
Washington, DC
