Site Reliability Engineer - Observability at Amwell (Remote)
Amwell is a leading telehealth platform in the United States and globally, connecting and enabling providers, insurers, patients, and innovators to deliver greater access to more affordable, higher-quality care. Amwell believes that digital care delivery will transform healthcare. We offer a single, comprehensive platform to support all telehealth needs from urgent to acute and post-acute care, as well as chronic care management and healthy living. With over a decade of experience, Amwell powers telehealth solutions for over 150 health systems comprising 2,000 hospitals and 55 health plan partners with over 36,000 employers, covering over 80 million lives.
Amwell is building a new Observability team to drive visibility and performance optimizations into our Converge platform. The Observability team is responsible for providing frameworks, tools, APIs, and visualizations that allow all Amwell engineers and developers to better understand the behavior of the features, services, and infrastructure they own and maintain. The team also teaches product, support, and systems teams how to monitor the platform appropriately and provides visualizations for monitoring distributed systems, helping to reduce operational overhead and deliver unmatched availability to our customers.
Responsibilities
- Design, implement, maintain, and deploy observability-related systems covering application and infrastructure telemetry and logs, metric visualizations, and alert management
- Define best practices around making our systems and services observable and work with teams to get those best practices applied
- Collaborate with Engineering, Platform and SRE teams to ensure our services, platforms and infrastructure are emitting the right metrics
- Collect, aggregate, and visualize metrics to provide actionable insights
- Track system scalability and performance metrics across various applications, databases, and integration endpoints
- Establish processes and frameworks for identifying production bottlenecks and resolving major issues
- Create and maintain documentation of new systems, processes, and procedures
- Ensure strict compliance with regulations such as HIPAA and PCI
- Lead infrastructure projects that improve observability tools and platforms: metrics, dashboarding, alerting, logging, application performance management, and distributed tracing
- Develop tools, dashboard alerts, and training that quickly give teams deeper insight into application performance and service health issues, reducing their MTTR and MTTD
Qualifications
- 5+ years of experience in software engineering, SRE, or DevOps
- Experience with one or more of the following: Elastic, CloudWatch, Prometheus
- Experience with development and deployment in a hosted cloud environment such as AWS or GCP
- Experience operating and utilizing Observability and Monitoring tools
- Familiarity with system performance, scalability, architecture, and design concepts
- Experience with medium to large metrics datasets
- Deep knowledge of logs, metrics, traces, and alerts, with the ability to obtain the right data to help teams make decisions quickly
- Well-versed in container deployment and orchestration technologies at scale, including fundamentals such as service discovery, deployments, monitoring, scheduling, and load balancing
- An ability to work well as part of a distributed team
- A mindset focused on reliability and customer satisfaction
Working at Amwell
Amwell is changing how care is delivered through online and mobile technology. We strive to make the hard work of healthcare look easy. To make this a reality, we look for people with a fast-paced, mission-driven mentality. We’re a culture that prides itself on quality, efficiency, smarts, initiative, creative thinking, and a strong work ethic.
Our Core Values include One Team, Customer First, and Deliver Awesome. Customer First and Deliver Awesome are all about our product and services and how we strive to serve. As part of One Team, we operate the Amwell Cares program, which brings needed assistance to our communities, whether that be free healthcare for the underserved or for people affected by natural disasters, support for equality, honoring doctors and nurses, or annual Amwell-matched donations to food banks. Amwell aims to be a force for good for our employees, our clients, and our communities.
Amwell cares deeply about and supports Diversity, Equity, and Inclusion. These initiatives are highlighted and reflected within our three DE&I pillars: our workplace, our workforce, and our community.
Amwell is a "virtual first" workplace, which means you can work from anywhere, coming together physically for ideation, collaboration, and client meetings. We enable our employees with the tools, resources, and opportunities to do their jobs effectively wherever they are! Amwell has collaboration spaces in Boston, Tysons Corner, Portland, Woodland Hills, and Seattle.
Benefits
- Unlimited Personal Time Off (Vacation time)
- 401(k) match
- Competitive healthcare, dental and vision insurance plans
- Paid Parental Leave (Maternity and Paternity leave)
- Employee Stock Purchase Program
- Free access to Amwell’s Telehealth Services, SilverCloud and The Clinic by Cleveland Clinic’s second opinion program
- Free Subscription to the Calm App
- Tuition Assistance Program
- Pet Insurance