UMBRA - JDE-High - Principal Site Reliability Engineer

Fort Meade, MD
In-Office
Senior level
Information Technology • Security • Software
The Role
Manage daily operations of a classified NOC, focusing on Kubernetes services, incident response, system monitoring, and ensuring security and availability.

Clarity Innovations is a trusted national security partner, dedicated to safeguarding our nation’s interests and delivering innovative solutions that empower the Intelligence Community (IC) and Department of Defense (DoD) to transform data into actionable intelligence, ensuring mission success in an evolving world.

Our mission-first software and data engineering platform modernizes data operations, utilizing advanced workflows, CI/CD, and secure DevSecOps practices. We focus on challenges in Information Warfare, Cyber Operations, Operational Security, and Data Structuring, enabling end-to-end solutions that drive operational impact.

We are committed to delivering cutting-edge tools and capabilities that address the most complex national security challenges, empowering our partners to stay ahead of emerging threats and ensuring the success of their critical missions. At Clarity, we are people-focused and set on being a destination employer for top talent, offering an environment where innovation thrives, careers grow, and individuals are valued. Join us as we continue to lead innovation and tackle the most pressing challenges in national security.

Position Overview

The Network Operations Center (NOC) Engineer assists the NOC Lead in managing and overseeing the daily operations of a classified cloud development environment that operates 8 a.m. to 5 p.m. EST, with a strong emphasis on maintaining Kubernetes-hosted services. The NOC Engineer is responsible for incident response coordination, system monitoring, team leadership, performance reporting, and the security and availability of the development environment.

Key Responsibilities
  • Carry out day-to-day operations of the classified NOC, ensuring adherence to service level agreements and system uptime requirements

  • Perform monitoring and support of cloud-based systems, networks, and containerized applications in Kubernetes clusters

  • Coordinate incident response, troubleshooting, and escalation procedures

  • Ensure timely detection, resolution, and documentation of service-impacting events

  • When the NOC Lead is absent, act as the primary point of contact for cloud system alerts, outages, and classified network incidents; communicate status to stakeholders and leadership

  • Ensure 24/7 observability of network, platform, and container-level components using tools such as Prometheus, Grafana, Fluentd, and Elastic Stack

  • Draft technical guidance for NOC staff and collaborate with engineering, cybersecurity, and cloud teams

  • Maintain situational awareness of the system through dashboards, logs, and proactive monitoring tools

  • Develop and maintain standard operating procedures, incident response plans, runbooks, and shift logs

  • Assist the NOC Lead in conducting daily stand-ups, shift handovers, and weekly ops reviews

  • Generate operational metrics and performance reports

  • Ensure compliance with federal security policies and contribute to continuous accreditation of the cloud system under RMF

  • Perform readiness drills and after-action reviews, and contribute to lessons-learned activities

Qualifications
  • Must be able to obtain and maintain a TS/SCI security clearance (note: only US citizens are eligible for security clearances)

  • Expertise in cloud infrastructure (AWS GovCloud, Azure Government, or C2S/C2E/JWCC), virtualization, and hybrid environments

  • Understanding of secure networking, load balancers, DNS in cloud-native architectures, and inter-cluster communication

  • Operational experience with Kubernetes, containerized workloads, and supporting technologies (Docker, Helm, Fluentd, Kustomize)

  • Strong understanding of monitoring tools (e.g., Prometheus, Grafana, ELK Stack) and ticketing systems (e.g., osTicket, Jira)

  • Familiarity with GitOps workflows (e.g., Flux) and infrastructure as code (e.g., Terraform)

  • Familiarity with DoD/IC cybersecurity compliance standards, ATO processes, and classified system governance

  • Excellent communication skills and the ability to clearly brief complex operational topics to leadership and mission partners

Preferred Qualifications
  • Active US TS/SCI security clearance with CI polygraph or higher

  • 5+ years of experience in IT operations or network/system administration

We are an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Top Skills

AWS GovCloud
Azure Government
C2E
C2S
Docker
Elastic Stack
Fluentd
Flux
Grafana
Helm
Jira
JWCC
Kubernetes
osTicket
Prometheus
Terraform

The Company
HQ: Columbia, MD
79 Employees
Year Founded: 2011

What We Do

Clarity Innovations designs, develops, and deploys force-enhancing software that links human ingenuity to powerful computing and makes our country and our world a better, safer place. Our principles of servant leadership, mutual respect, and technological excellence enable us to thrive on the challenges of the digital frontline.

Clarity is focused on helping the Government redefine its relationship with technology by encouraging the use of DevSecOps and Agile methodologies, small-teams constructs, and process automation. To create an atmosphere that embraces the challenges inherent in progress, we honor creativity and curiosity, supporting exploration and growth with modern tech stacks and methodologies.

We are as much a family as we are a team, collectively engineering technology that improves the lives and work of the people who use it. We empower our people with the tools, training, and vision they need to outperform the competition.
