Senior Site Reliability Engineer

Posted 3 Days Ago
McLean, VA, USA
In-Office
135K-150K Annually
Senior level
Software
The Role
Lead reliability for a large federal cloud platform: define SLOs, build observability, run incident response and postmortems, automate toil, design AWS/EKS infrastructure, mentor engineers, and present reliability designs to stakeholders.

Job number: 884

This is a remote position.

Ad Hoc is a technology company that empowers organizations to deliver scalable, impactful digital services. Using modern, agile methods, our team creates products that meet people’s needs and transform their experience of government.

Work on things that matter

Our collaborations have shaped some of the defining moments in public-sector service delivery. We’ve helped build products that connect Veterans to tailored services, help millions access affordable health care, and support important programs like Head Start. As we work with agencies to deliver critical services, we’re also changing how the government approaches technology.

Built for a remote life

Our culture, communications, and tools are built for remote work, enabling us to bring together top talent nationwide. At Ad Hoc, remote life empowers our teams to design work environments that fit their lives and that foster flexibility and collaboration to achieve positive outcomes for our customers.

Committed to high expectations and a welcoming culture

Ad Hoc values acceptance, accountability, and humility. We aren’t heroes. We learn from our mistakes and improve the process for the next time. We build small, inclusive teams to collaborate closely with our partners to solve the right problems and deliver software that works.

The Veterans Affairs business unit helps transform the VA into a modern digital services organization where Veteran outcomes are at the center of every effort. We partner with the VA to design and deliver seamless user experiences for Veterans, their families and caregivers, and VA employees. By applying better practices in service design, product management, and technology, we enable the VA to increase the use, quality, and reliability of services and decrease the time Veterans spend waiting for outcomes.

Primary Responsibilities:

As a Senior Site Reliability Engineer, you will serve as an experienced individual contributor responsible for the availability, performance, and reliability of a large federal enterprise cloud platform that operates around the clock. With minimal oversight, you will help meet scope, schedule, and delivery requirements while shaping the platform's reliability strategy. Primary expectations of a Senior Site Reliability Engineer include:

  • Defining and maintaining service level objectives (SLOs), service level indicators, and error budgets, and driving the platform toward them
  • Designing and operating observability across metrics, logging, tracing, and alerting
  • Leading incident response and on-call practices, including escalation, mitigation, and time-to-recovery improvements
  • Driving blameless postmortems and systemic reliability improvements
  • Engineering automation to eliminate toil and improve operational efficiency
  • Independently designing reliable cloud infrastructure (AWS) and Kubernetes (Amazon EKS) environments, including weighing tradeoffs among cost, reliability, and efficiency
  • Building reusable modules and mentoring engineers on reliability practices
  • Presenting design documents and system diagrams to stakeholders
  • Participating in in-depth technical interviews of candidates

Basic Qualifications:

  • Bachelor's degree and 7+ years of experience; relevant experience may be substituted for education
  • Demonstrated experience owning reliability (SLOs, observability, incident response) for production systems
  • Expert-level knowledge of at least one infrastructure-as-code tool (Terraform preferred)
  • Deep command of cloud infrastructure, containerization, and networking
  • Must be able to obtain and maintain a U.S. Public Trust / suitability determination

Preferred Qualifications:

  • Prior experience with the Department of Veterans Affairs
  • Kubernetes (Amazon EKS) and AWS at scale
  • Familiarity with FedRAMP, NIST 800-53, and zero-trust architecture
  • Relevant certifications (e.g., AWS, CKA/CKS)

To learn more about working at Ad Hoc, please visit: https://adhocteam.us/join

Benefits:

  • Company-subsidized health, dental, and vision insurance
  • Flexible PTO
  • 401(k) with employer match
  • Paid parental leave after one year of service
  • Employee Assistance Program

Ad Hoc LLC is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, national origin, ancestry, sex, sexual orientation, gender identity or expression, religion, age, pregnancy, disability, work-related injury, covered veteran status, political ideology, marital status, or any other factor that the law protects from employment discrimination.

We value the unique skills gained through military service and encourage veterans and transitioning service members to apply.

In support of various state and city equal pay transparency laws, Ad Hoc job descriptions feature the starting range we reasonably expect to pay to candidates who would join our team with little to no need for training on the responsibilities we've outlined above. Actual compensation is influenced by a wide range of factors including but not limited to skill set, level of experience, and responsibility. The range of starting pay for this role is $135,000-$150,000. Our recruiters will be happy to answer any questions you may have, and we look forward to learning more about your salary requirements.

Job reference: https://adhoc.team/


The Company
HQ: The Hague
486 Employees
Year Founded: 2014

What We Do

Ad Hoc is a digital services company that helps the federal government better serve people. Our team of experts from across commercial industry and government brings the modern skills necessary to help agencies transform public services into digital services. Our work enables agencies to meet the needs of their users while closing the gap between consumer expectations and government.
