Site Reliability Engineer (SRE)

Posted 24 Days Ago
Hiring Remotely in USA
Remote
Mid level
Information Technology • Other • Software • Consulting
The Role
The Site Reliability Engineer (SRE) will ensure system reliability and performance, automate operations, develop CI/CD pipelines, and manage cloud infrastructure.
hatch I.T. is partnering with CardioOne to find a Site Reliability Engineer (SRE) to join their team. See details below:

About the Role:
CardioOne is seeking a highly skilled Site Reliability Engineer (SRE) to ensure the reliability, scalability, security, and performance of their production systems and services. The SRE will bridge the gap between software development and operations, implementing automation, monitoring, and best practices to enable rapid, reliable delivery of applications. You will report directly to the Senior Director of Engineering.

About the Company:
CardioOne partners with independent cardiologists to provide innovative solutions that improve patient outcomes and reduce costs. Their platform helps their physician partners thrive in today’s fee-for-service environment and prepare for success in value-based care. In February 2024, they partnered with WindRose Health Investors as well as top physician services and payor executives to grow their team and invest in their next phase of growth.

CardioOne offers a magnificent work environment, good working conditions, and competitive pay. They offer medical, dental, vision, and a 401k plan with a match to benefit eligible employees. They offer PTO (Personal Time Off) and sick time to full-time employees. They take pride in creating a culture of employee engagement that translates into an exemplary patient experience. Join them in their mission to positively impact US cardiology.

Responsibilities:

  • Ensure high availability, scalability, and performance of production systems.
  • Implement and maintain SLIs, SLOs, and SLAs for critical services.
  • Conduct capacity planning and performance tuning.
  • Automate infrastructure provisioning using IaC tools such as Terraform, Terragrunt, and Ansible.
  • Develop automation to minimize manual operations and improve deployment workflows.
  • Build CI/CD pipelines to support rapid and reliable deployments.
  • Design and maintain monitoring, logging, and alerting systems (Datadog).
  • Participate in on-call rotations and lead incident response efforts.
  • Perform root-cause analysis and develop postmortems to prevent recurring issues.
  • Manage cloud infrastructure (AWS, Azure) and container orchestration platforms (Kubernetes, ECS).
  • Optimize system architecture for reliability and fault tolerance.
  • Implement best practices for security, networking, and service resilience.
  • Work closely with development teams to design reliable microservices and distributed systems.
  • Advocate for SRE principles and drive operational excellence across engineering teams.
  • Mentor engineers on reliability practices, tooling, and automation strategies.

Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
  • 3–7 years of experience in SRE, DevOps, or Systems Engineering roles.
  • Strong proficiency with Linux systems and shell scripting.
  • Experience with cloud platforms (AWS, Azure).
  • Hands-on experience with Kubernetes/ECS and container technologies (Docker).
  • Proficiency in at least one programming language: Python or Java.
  • Experience with CI/CD pipelines and DevOps tooling.
  • Strong understanding of distributed systems, networking, and security fundamentals.
  • Strong analytical and problem-solving skills.
  • Excellent communication and cross-team collaboration.
  • Ability to thrive in fast-paced, high-stakes environments.
  • A mindset focused on continuous improvement and operational excellence.

Preferred Qualifications:

  • Experience with observability stacks (OpenTelemetry).
  • Knowledge of database management (PostgreSQL).
  • Experience with configuration management tools (Ansible, Chef, Puppet).
  • Familiarity with zero-downtime deployments and chaos engineering practices.

Top Skills

Ansible
AWS
Azure
Datadog
Docker
ECS
Java
Kubernetes
Python
Terraform
Terragrunt

The Company
HQ: Vienna, VA
24 Employees
Year Founded: 2011

What We Do

Get behind the scenes insights from startup tech teams: https://www.myhatchpad.com/newsletter/

hatch I.T. is a specialized technology consulting firm connecting software, product, and data engineers with tech startups in emerging tech markets. We offer customized models that transform the way early-stage and high-growth startups scale. Our flagship programs include:

- Scale – technical consulting and recruiting services for high-growth startups
- Stride – technical strategy and consulting for early-stage startups
- hatchpad – an online community platform connecting startup technologists to network, learn, and advance in their careers

In true startup fashion, our roots can be traced to a garage in Leesburg, VA in 2013. While working with local startups, our Founder & CEO, Tim Winkler, realized that traditional staffing models didn’t align with the growth needs of startups. Working with those firms felt transactional and the costs were way outside a startup's budget. There was a need for a solution that was relational, community driven, and flexibly priced. With this in mind, hatch I.T. was formed, along with customized models that transform the way early-stage and high-growth startups scale.

Fast forward 8 years and 15 employees, and hatch has developed a platform that provides a roadmap to guide startups from MVP through all stages of growth. After proving this model with dozens of startups across DC, Maryland, and Virginia, we realized it was needed in all emerging startup markets.

If you’re a startup looking to grow your startup team, or an engineer looking for a career at an innovative tech company, connect with hatch I.T. today.
