Site Reliability Engineer I

Hyderabad, Telangana, IND
In-Office
Mid level
Healthtech
The Role
Responsible for incident management, system monitoring, automation, and collaboration with teams to ensure platform reliability and compliance in a fintech environment.

NationsBenefits is at the forefront of transforming the insurance industry by developing innovative benefits management solutions. We specialize in modernizing complex back-office systems to build scalable, secure, and high-performing platforms that streamline operations for our clients.

Our strategic focus is on platform modernization, transitioning legacy systems into modern, cloud-native architectures to enhance scalability, reliability, and performance in core back-office insurance and fintech functions.

Role Overview

The Site Reliability Engineering (SRE) team is instrumental in maintaining the health, performance, and availability of our platforms. As a Site Reliability Engineer, you will play a crucial role in ensuring system reliability by monitoring metrics, managing incidents, and collaborating with Development, DevSecOps, and Engineering teams.

You will work with monitoring tools like Datadog, troubleshoot incidents in Kubernetes and cloud environments, and contribute to automation initiatives using C#, Java, or scripting languages. Your focus will be on maintaining high availability, ensuring security and compliance in fintech environments, and driving continuous service improvement.

Key Responsibilities

Incident Triage & Resolution

· Act as the first line of defense in identifying, triaging, and resolving production incidents.

· Respond to and troubleshoot alerts from monitoring tools such as Datadog.

· Perform initial root cause analysis and escalate per service level agreements (SLAs).

· Collaborate with senior engineers to resolve escalated issues and provide timely communication to stakeholders.

Monitoring & Alerting

· Proactively monitor system health, performance metrics, and service uptime using Datadog.

· Manage and optimize alerting thresholds to detect anomalies while reducing false positives.

· Monitor and troubleshoot workloads in Kubernetes, including pod restarts, log analysis, and deployment rollbacks.

24/7 Support

· Participate in a rotational shift schedule (24/7), including weekends and holidays, to ensure continuous production support.

Collaboration & Communication

· Work closely with development, operations, and engineering teams to diagnose and resolve issues.

· Provide feedback on recurring issues and recommend process or tooling improvements.

· Partner with global teams, demonstrating strong cross-cultural collaboration skills.

Automation & Continuous Improvement

· Develop and maintain automation scripts or small tools using C#, Java, Python, PowerShell, or Bash.

· Contribute to CI/CD pipeline monitoring and reliability.

· Assist in building self-healing and automated recovery solutions to minimize manual intervention.

Documentation & Compliance

· Maintain comprehensive documentation of incidents, triage steps, and post-mortem analysis.

· Ensure all processes adhere to fintech compliance standards such as PCI DSS or ISO 27001.

Required Qualifications

· 3+ years of experience in Site Reliability Engineering, DevOps, or a related role.

· Experience with incident triage, resolution, and escalation processes.

· Proficiency with Datadog or similar monitoring/observability tools.

· Strong scripting or programming skills in C#, Java, Python, PowerShell, or Bash.

· Experience with Kubernetes (monitoring, troubleshooting, scaling workloads) and container technologies such as Docker.

· Familiarity with SQL, MySQL, or NoSQL databases.

· Ability to work effectively in high-pressure, high-transaction fintech environments.

· Strong written and verbal communication skills for both technical and non-technical audiences.

· Ability to work in rotational 24/7 shifts, including weekends and holidays.

Desired Skills

· Knowledge of cloud platforms such as Azure, AWS, or GCP.

· Familiarity with CI/CD pipelines, Helm charts, and deployment automation.

· Awareness of ITIL processes and agile methodologies.

· Understanding of regulatory and security compliance requirements in fintech (e.g., PCI DSS, ISO 27001).

Why Join Us?

· Competitive salary and benefits.

· Collaborative, inclusive, and growth-focused work environment.

· Opportunities for career advancement and professional development.

· Hands-on experience with cutting-edge fintech, cloud, and Kubernetes technologies.

NationsBenefits Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about NationsBenefits and has not been reviewed or approved by NationsBenefits.

  • Fair & Transparent Compensation: Pay is frequently characterized as decent or good for certain entry-level and frontline roles, with timely pay also highlighted. Compensation is sometimes positioned as competitive relative to the work performed in those positions.
  • Leave & Time Off Breadth: Unlimited PTO is described as available for some salaried roles, which can increase perceived flexibility. Paid holidays and paid time off are presented as part of the standard package for eligible employees.
  • Wellbeing & Lifestyle Benefits: A fitness stipend and occasional company-sponsored outings or training-related perks are included among the extra benefits. These additions can modestly strengthen the overall rewards experience beyond core insurance.

The Company
Costa Mesa, CA
115 Employees
Year Founded: 2015

What We Do

NationsBenefits® is a leading supplemental benefits company providing managed care organizations with innovative healthcare solutions helping to promote independence, health, and well-being for more than 20 million members across the U.S. When the company was founded in 2015 by Glenn Parker, M.D., we set out to disrupt the healthcare industry. In 2020, we rebranded to NationsBenefits to expand the company’s core offering and broaden the scope of our clinically focused services. Today, we surpass traditional benefit management programs by helping our health plan partners drive growth, improve outcomes, reduce costs, and delight members. Our best-in-class service model engages members in meaningful and measurable ways with technology-based solutions tailored to the unique needs of each population.
