API Management Platform - Cloud Reliability Engineering Lead

Hiring Remotely in Hartford, CT
In-Office or Remote
165K-247K Annually
Expert/Leader
Fintech • Payments • Financial Services
The Role
The role involves leading reliability engineering strategies for API Hosting Platforms, managing incident response, team leadership, and driving automation and best practices across cloud services.
Summary Generated by Built In
Director & Reliability Engineering - IE06IE

We’re determined to make a difference and are proud to be an insurance company that goes well beyond coverages and policies. Working here means having every opportunity to achieve your goals – and to help others accomplish theirs, too. Join our team as we help shape the future.   

Description

The Hartford’s Cloud Services team is seeking an experienced and highly motivated Reliability Engineering Lead who will be responsible for driving the reliability, scalability, and performance of API Hosting Platforms across multiple cloud providers. This hands-on leader will build a team responsible for engineering and operational practices that ensure our Cloud API Platforms are secure, resilient, observable, and continuously available.

The Reliability Engineering Lead will partner across teams to champion modern reliability practices, guide technical roadmaps, and build a culture of operational excellence.

Job Level: Director & Reliability Engineer, T6

Responsibilities

  • Lead the design and implementation of reliability strategies across the API hosting platform, including availability, performance, capacity planning, and operational readiness.
  • Define and enforce reliability standards, SLIs/SLOs, and error budgets for platform services and customer-facing APIs.
  • Oversee incident management, ensuring strong triage, root-cause analysis, and preventive action development for platform issues.
  • Drive automation to reduce manual operations, improve deployment safety, and strengthen the platform's security baselines.
  • Establish and maintain robust observability practices, including logging, metrics, tracing, and synthetic monitoring.
  • Build and lead a team of reliability engineers, providing mentorship, coaching, and technical direction.
  • Work with application owners to prioritize reliability-focused backlog items and improve platform health over time.
  • Identify and implement cost-saving opportunities.
  • Serve as a subject‑matter expert for reliability engineering best practices across the organization.
  • Collaborate with security teams to ensure platform compliance with enterprise security standards.
  • Integrate security practices into CI/CD workflows and platform architecture.
  • Participate in risk assessments, audits, and compliance reviews for API platform services.
  • Advocate for modern reliability practices (e.g., chaos engineering, resilience testing, auto‑remediation).
  • Evaluate and introduce new technologies, tooling, and methodologies to keep platform operations modern and efficient.
  • Monitor industry trends and translate them into actionable platform improvements.

Qualifications

  • 8+ years of technical experience in engineering, platform management, and operations roles, with a demonstrated track record of technical innovation and experience leading technically diverse teams.
  • Strong cloud engineering mindset, with experience across public cloud providers and the technologies most frequently used to engineer and manage highly reliable, automated environments.
  • Strong experience with API management or hosting platforms (Apigee, AWS API Gateway).
  • Expertise with cloud-native technologies (Kubernetes, containers, distributed systems).
  • Deep knowledge of performance and observability tools such as Dynatrace, Splunk, CloudWatch, and CloudTrail.
  • Proven track record leading engineering teams or technical initiatives.
  • Strong understanding of CI/CD, release automation, and DevOps tooling.
  • Excellent communication, stakeholder management, and problem‑solving skills.
  • Knowledge of networking fundamentals, API security, and Zero Trust principles.
  • Experience with incident command roles in major incident processes.
  • Strong knowledge of and experience with cloud product management, cloud engineering, and Agile principles.
  • Strong experience with automation tools such as Ansible and Terraform.
  • Exceptional critical thinking and problem-solving skills.
  • Able to influence diverse teams and build strong business relationships.

Compensation

The listed annualized base pay range is primarily based on analysis of similar positions in the external market. Actual base pay could vary and may be above or below the listed range based on factors including but not limited to performance, proficiency and demonstration of competencies required for the role. The base pay is just one component of The Hartford’s total compensation package for employees. Other rewards may include short-term or annual bonuses, long-term incentives, and on-the-spot recognition. The annualized base pay range for this role is:

$164,800 - $247,200

Equal Opportunity Employer/Sex/Race/Color/Veterans/Disability/Sexual Orientation/Gender Identity or Expression/Religion/Age

Top Skills

Ansible
API Management
CI/CD
Cloud Computing
CloudTrail
CloudWatch
Containers
DevOps
Distributed Systems
Dynatrace
Kubernetes
Splunk
Terraform
The Company
HQ: Hartford, Connecticut
20,002 Employees
Year Founded: 1810

What We Do

Human achievement is at the heart of what we do. We put our belief into action by not only ensuring individuals and businesses are well protected, but by going even further – making an impact in ways that go beyond an insurance policy.
