Site Reliability Engineer at Signify Health (Remote)
Signify Health is looking for a Site Reliability Engineer to join our SRE and Release Management team. Keys to our SRE culture include teamwork, inquisitiveness, problem-solving, critical thinking, transparency, and diversity. We are looking for a dedicated SRE who enjoys building and running distributed systems at scale in an AWS environment and appreciates the challenges and trade-offs involved in building and deploying systems to production, including deployments, monitoring, scheduling, and load balancing.
We work closely with software and systems engineers to drive adoption of modern reliability practices like SLOs, error budget policies, actionable alerts, incident retrospectives, chaos testing, and end-to-end ownership.
You will discover ample opportunities for growth in many areas, such as improved technology skills, effective leadership, dedicated mentorship, creative design, strong communication skills, teamwork, and more. Simply put, as an SRE, you will help Signify Health achieve high system availability and service reliability through best-of-breed observability techniques. Are you up for the challenge?
Responsibilities
- Design, develop, and implement software that improves the stability, scalability, availability, and latency of Signify Health products
- Implement application/infrastructure observability solutions and perform maintenance to ensure desired application availability
- Manage services in real time, including building monitoring for the golden-signal SLIs, establishing and negotiating SLOs with the business, building alerting, and creating playbooks and runbooks for services in conjunction with development teams, product owners, and support
- Triage incidents, decompose them into smaller pieces, and identify probable root causes using skills gained through debugging code, operating networks, building hardware, or working in other, entirely unrelated domains
- Work closely with software engineers to build reliable, performant systems
Qualifications
- Bachelor's degree in a relevant technical field of study, or an equivalent combination of experience and training
- 5 or more years of relevant professional experience
- Knowledge of standard methodologies related to security, performance, and disaster recovery
- Strong knowledge of database administration and management (for example, RDBMS, RDS, various SQL databases, MongoDB, etc.)
- Skilled in identifying performance bottlenecks, detecting anomalous system behavior, and resolving the root causes of service issues
- Demonstrated ability to work across teams and functions to influence design, operations and deployment of highly available software
- Strong analytical skills in support of production issue resolution and root cause identification
- Strong organizational skills to manage a variety of work areas and cross-team engagements
- Strong experience working on high-data-volume applications managed with modern Infrastructure-as-Code methodologies and tooling
- Experience with container technologies and orchestration platforms (Docker, Kubernetes, Rancher, Cloud Foundry)
- Experience managing and using CI/CD systems (Bamboo, Azure DevOps, Jenkins, CircleCI)
- Experience implementing a highly scalable, distributed CI/CD pipeline
- Experience working with monitoring and observability tools (we use New Relic and OpsGenie)
Some Preferred Qualifications
- Strong knowledge of programming/scripting languages (Python, Bash, Groovy, Golang) and IaC tooling (Terraform). Software engineers looking to get into SRE/DevOps are encouraged to apply.
- Understanding of IT capacity management, including the ability to map IT resources to current and future requirements
- Prior database administration background (RDBMS, RDS, Snowflake, SQL, Oracle, MongoDB, PostgreSQL, etc.)
Signify Health is a leading healthcare platform that leverages advanced analytics, technology, and nationwide healthcare provider networks to create and power value-based payment programs. Our mission is to transform how care is paid for and delivered so that people can enjoy more healthy, happy days at home.
We’re focused on activating the home as a key part of the care continuum, lessening dependence on facility-centric care, preventing adverse events and facilitating holistic condition management to address individuals’ total clinical, behavioral and social care needs.
Our solutions support value-based payment programs for payors, providers and other healthcare organizations by aligning financial incentives around health outcomes. We meet people where they are, helping them stay healthy and independent at home and supporting their recovery homeward as part of an episode of care.
To learn more about how we’re driving outcomes and making healthcare work better, please visit us at www.signifyhealth.com.