We are looking for an engineering leader to run the Federal Operations team. You will stabilize and mature the platform—owning service reliability, release safety, and continuous compliance—so we can confidently expand customer coverage and delivery velocity. You’ll manage and grow a high-performing team, build strong cross-functional muscle, and deliver predictable outcomes in a regulated environment.
What You’ll Do
- Lead and grow Gov SRE to operate a secure, reliable, and scalable government environment (people leadership, hiring, coaching, performance, and culture).
- Enforce SLOs/SLIs, incident response, on-call, and change management processes aligned to internal risk thresholds.
- Drive platform stabilization: reduce toil, harden baselines, improve observability, and shrink MTTR through runbooks, automation, and quality guardrails.
- Own safe delivery to the Gov environment: plan and orchestrate releases, change reviews, and rollbacks; improve CI/CD and IaC workflows for repeatable, auditable change.
- Build and prioritize an execution roadmap for Gov v2.0 launch (scale product enablement in Gov, reduce operational drag, and improve deployment lead time without increasing risk).
- Improve cost, performance, and resiliency posture in GovCloud through architecture reviews, reliability testing, and capacity planning.
- Report clear metrics, risks, and progress to stakeholders; proactively escalate blockers and propose mitigations.
- 7+ years in SRE/Infrastructure/Platform Engineering operating customer-facing services at scale.
- 2+ years managing SRE/Infra/FedOps teams with on-call ownership, including hands-on leadership in incident management, SLOs/SLIs, observability, and change/release management.
- Practical experience with cloud infrastructure (AWS preferred), Kubernetes/containers, Terraform or similar IaC, and modern CI/CD.
- Strong cross-functional collaboration with Security/Compliance, Product & Eng, and GTM; excellent written/runbook documentation and stakeholder communication.
- Track record of automation that reduces toil and improves reliability, auditability, and developer productivity.
- Ability to set clear goals/metrics, manage a prioritized roadmap, and deliver outcomes through the team.
- Depth in incident tooling and telemetry (e.g., metrics, tracing, logging) and alert hygiene.
- Experience operating in AWS GovCloud.
- Hiring and scaling a small team in a fast-moving, high-accountability environment.
- Product onboarding to Gov, dependency readiness, release planning, and quality gates.
- Partner across teams in platform infrastructure: shared patterns/modules, CI/CD safety, IaC standards, and golden-path delivery.
- Customer-facing teams: deployment readiness for POVs and go-live, incident comms, and reliability posture for federal customers.
At Abnormal AI, certain roles are eligible for a bonus, restricted stock units (RSUs), and benefits. Individual compensation packages are based on factors unique to each candidate, including their skills, experience, qualifications and other job-related reasons.
Abnormal AI is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status or other characteristics protected by law.
What We Do
The Abnormal Security platform protects enterprises from targeted email attacks. Abnormal Behavior Technology (ABX) models the identity of both employees and external senders, profiles relationships, and analyzes email content to stop attacks that lead to account takeover, financial damage, and organizational mistrust. Through one-click, API-based Office 365 and G Suite integration, Abnormal sets up in minutes and does not disrupt email flow.
Abnormal Security was founded in 2018 by CEO Evan Reiser, CTO Sanjay Jeyakumar, Head of Machine Learning Jeshua Bratman, and Founding Engineers Abhijit Bagri and Dmitry Chechik. The team previously built behavioral profiling and machine learning technologies at Twitter, Google and Pinterest that are being applied to solve a problem that costs organizations $1 billion per year, according to the FBI. The Abnormal Security platform stops targeted phishing, business email compromise and account takeover attacks that have never been seen before.









