Principal Site Reliability Engineer

Sorry, this job was removed at 03:06 p.m. (CST) on Thursday, May 08, 2025
2 Locations
In-Office or Remote
126K-228K Annually
Information Technology • Software
The Role

 Leidos has an opportunity within the newly created Digital Modernization Practice Area, leading Site Reliability Engineering for the Repeatable Offerings (RO) organization. The RO organization is the delivery arm of the Digital Modernization sector’s Repeatable Offerings, delivering differentiated capabilities and managed services across the sector and the larger Leidos corporation. We are seeking a Principal Site Reliability Engineer (SRE) to lead the design, implementation, and operation of scalable, highly available systems. As a subject matter expert, you will establish best practices for reliability, security, and efficiency while driving innovation in our deployment and operations strategies. You will collaborate with development teams to improve system performance, automate processes, and ensure smooth recovery in high-pressure situations.

The team is primarily located in Blacksburg, VA. The selected candidate will be required either to be on-site in Blacksburg or to travel frequently to that location, as well as to other locations as required.

Primary Responsibilities:

  • Lead the development and execution of SRE strategies to enhance system reliability, scalability, and efficiency.
  • Manage production systems and operations, ensuring robust development and implementation processes.
  • Oversee recovery efforts for unstable or at-risk projects, applying expertise in remediation strategies.
  • Design and implement microservice architectures, including orchestrators, for high-performance distributed systems.
  • Develop, maintain, and optimize CI/CD pipelines, infrastructure as code (IaC), and automation frameworks.
  • Drive adoption of best practices for horizontal and vertical scaling of microservices.
  • Define and implement packaging and deployment strategies to support rapid and reliable software delivery.
  • Collaborate with engineering teams to improve observability, monitoring, and operational excellence.
  • Provide technical leadership in managing containerized applications and orchestration platforms.
  • Mentor and guide teams on modern reliability engineering methodologies and best practices.

Basic Qualifications:

  • Requires a BS degree and 12 – 15 years of prior relevant experience, or a Master's degree with 10 – 13 years of prior relevant experience. Additional years of experience are accepted in lieu of a degree.
  • Proven experience as a Principal SRE or equivalent role in establishing robust and reliable systems.
  • Expertise in managing production systems and operations, including monitoring, incident response, and performance optimization.
  • Strong experience with Kubernetes and container orchestration.
  • Deep understanding of CI/CD pipelines, infrastructure as code (IaC), Helm Charts, and Operators.
  • Hands-on experience in designing and implementing microservice architecture and distributed systems.
  • Experience leading development teams in packaging and deployment strategies.
  • Strong knowledge of management strategies and techniques to support SRE principles.
  • Must have U.S. Citizenship.
  • Must be able to obtain and maintain a Public Trust clearance specific to the customer.

Preferred Qualifications:

  • Strong experience with OpenShift in enterprise environments.
  • Experience with auto-scaling, self-healing architectures, and advanced resiliency strategies.
  • Demonstrated success in improving and recovering red/unhealthy projects.
  • Familiarity with service mesh technologies and distributed tracing for monitoring and observability.
  • Expertise in designing and implementing highly available, fault-tolerant systems at scale.
  • Experience working on Federal Government contracts.

Original Posting: April 8, 2025

For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Pay Range: $126,100.00 - $227,950.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.


The Company
Alexandria, VA
27,104 Employees

What We Do

We Are Leidos

For 50 years we have been tackling some of the biggest problems that face our nation and our world.

OUR MISSION
Through our culture of innovation and history of performance, we develop deep customer trust built on integrity and create enduring solutions that improve our world. Leidos is a science and technology solutions leader working to address some of the world’s toughest challenges in the defense, intelligence, homeland security, civil, and healthcare markets. The company’s 43,000 employees support vital missions for government and commercial customers. Headquartered in Reston, Va., Leidos reported annual revenues of approximately $11.09 billion for the fiscal year ended January 3, 2020.

Leidos was cited for the meaningful work employees perform that is challenging, impactful, and aligned with our customers’ missions as reasons professionals want to work and stay at our company. Leidos has also been named to lists including Forbes’ Best Employers for Diversity, Forbes’ America’s Best Employers for Women, Military Times Best for Vets Employers, and Ethisphere Institute’s World's Most Ethical Companies®.

Employees enjoy career enrichment opportunities available through mobility and development and experience rewarding relationships with supportive supervisors and talented colleagues and customers. Employees appreciate our flexible work environment, allowing for and encouraging a true work-life balance. Our professionals are also excited about our Employee Resource Groups, like the newly launched Collaborative Outreach with Remote and Embedded Employees (CORE), which strives to create an environment where every employee, regardless of location, feels fully engaged as a valued employee of Leidos.

Your most important work is ahead.
