Senior Site Reliability Engineer

Reposted 9 Days Ago
Hiring Remotely in San Diego, CA, USA
In-Office or Remote
Senior level
Information Technology
The Role
The Senior Site Reliability Engineer is responsible for architecting reliability strategies, implementing SRE frameworks, mentoring engineers, and ensuring system resilience and performance in government systems.

Company Overview:
Arctiq is a global, intelligence-driven technology services company delivering professional and managed services across Hybrid Cloud Infrastructure, Networking & Connected Experiences, Cybersecurity, Data & AI, Autonomous Operations & Intelligence, and Enterprise Service Management. We help organizations operate, secure, and modernize complex environments by unifying infrastructure, networking, data, security, automation, and observability under a single, integrated operating model. Our work focuses on helping customers reduce operational friction, improve resilience, and make better, faster decisions as their environments evolve. Arctiq builds on decades of industry expertise and a customer-centric ethos to deliver exceptional value to clients across diverse industries.

This is a remote, contract opportunity for a project Arctiq is delivering for a client. Candidates must have or be able to obtain a Secret Clearance.

This position requires U.S. citizenship due to government security clearance and/or federal contract requirements.


This is a remote opportunity, with preference given to candidates located in San Diego, CA; Norfolk, VA; or Charleston, SC.

Position Overview:

The Senior Site Reliability Engineer is a technical leader responsible for architecting the reliability strategy for large-scale, distributed government systems. You will lead the implementation of the SRE framework, driving the adoption of SLO-based management and advanced automation. As a subject matter expert, you will mentor mid-level engineers and interface with government stakeholders to ensure system resilience and performance meet mission requirements.

Responsibilities: 

  • Reliability Architecture: Define the strategy for Service Level Objectives (SLOs) and Error Budgets. Design complex telemetry pipelines for full-stack observability.
  • Strategic Automation: Design and govern the enterprise Infrastructure as Code (IaC) standards. Develop custom tooling to automate complex recovery procedures and system scaling.
  • Incident Command: Act as the Incident Commander for major system outages, leading the technical response and directing the Root Cause Analysis (RCA) process.
  • Security & Compliance: Lead the integration of security-as-code within DevSecOps pipelines, ensuring full compliance with RMF and NIST 800-53 standards.
  • Mentorship: Provide technical guidance and mentorship to Mid-Level SREs and developers, fostering a culture of reliability across the organization.

Qualifications:

  • 7+ years of experience in SRE or DevOps, with significant experience in distributed systems.
  • Expertise in Go, Python, or Java and advanced knowledge of Linux internals.
  • Extensive experience managing production Kubernetes environments and complex cloud architectures.
  • Proven track record of defining and meeting SLOs for high-availability systems.
  • Experience navigating government Risk Management Framework (RMF) processes.
  • Education: Bachelor’s or Master’s degree in Computer Science or Engineering.
  • Certifications: CKA (Certified Kubernetes Administrator) and an industry observability certification preferred.
  • Strong communication skills.
  • Comfortable interacting with leadership.
  • Experience promoting best practices.
  • Understanding of operations and procedures to help lead the team.
  • 7+ years of leadership experience.
  • Executive presence.

Arctiq is an equal opportunity employer. If you need any accommodations or adjustments throughout the interview process and beyond, please let us know. We celebrate our inclusive work environment and welcome members of all backgrounds and perspectives to apply.

 

We thank you for your interest in joining the Arctiq team! While we welcome all applicants, only those who are selected for an interview will be contacted.


The Company
HQ: Irvine, California
377 Employees

What We Do

Arctiq is a leader in professional IT services and managed services across three core Centers of Excellence: Enterprise Security, Modern Infrastructure and Platform Engineering. Renowned for our ability to architect intelligence, we connect, protect, and transform organizations, empowering them to thrive in today's digital landscape.
