Site Reliability Engineer

Posted Yesterday
Frederick, MD, USA
In-Office
Senior level
Information Technology • Consulting
The Role
The Site Reliability Engineer designs monitoring frameworks, manages SLIs/SLOs, automates cloud infrastructure, ensures compliance, and promotes SRE best practices.
(ID: 2025-1135)

Axle is a bioscience and information technology company that offers advancements in translational research, biomedical informatics, and data science applications to research centers and healthcare organizations nationally and abroad. With experts in biomedical science, software engineering, and program management, we focus on developing and applying research tools and techniques to empower decision-making and accelerate research discoveries. We work with some of the top research organizations and facilities in the country including multiple institutes at the National Institutes of Health (NIH).


Benefits We Offer:

  • 100% Medical, Dental & Vision Coverage for Employees
  • Paid Time Off and Paid Holidays
  • 401K match up to 5%
  • Educational Benefits for Career Growth
  • Employee Referral Bonus
  • Flexible Spending Accounts:
    • Healthcare (FSA)
    • Parking Reimbursement Account (PRK)
    • Dependent Care Assistance Program (DCAP)
    • Transportation Reimbursement Account (TRN)

Responsibilities:

  • Design and implement enterprise-grade monitoring and observability frameworks (metrics, logs, traces) across distributed systems using enterprise Splunk, Grafana, and OpenTelemetry tools

  • Establish and manage SLIs, SLOs, and error budgets to drive reliability improvements 

  • Develop and maintain real-time asset inventory systems across cloud, on-prem, and hybrid environments 

  • Automate workload onboarding and offboarding processes, ensuring standardization and governance 

  • Track system ownership, dependencies, and lifecycle states for operational transparency

  • Build proactive detection mechanisms using AIOps and intelligent alerting to minimize incident impact

  • Design and operate scalable, resilient, and secure infrastructure platforms across cloud and hybrid environments 

  • Implement automated compliance tracking and enforcement aligned with organizational and regulatory standards (e.g., NIST, FISMA, FedRAMP) 

  • Embed ITIL processes (incident, change, problem, configuration management) into SRE workflows 

  • Build and maintain automated deployment environments and pipelines that enforce security, compliance, and operational standards 

  • Develop “golden paths” and standardized platform templates for consistent workload deployment 

  • Automate provisioning, patching, configuration management, and environment lifecycle 

  • Leverage AI/ML coding assistants and vibe coding practices to rapidly develop automation scripts, tools, and internal platforms 

  • Integrate AI-driven tooling into DevOps pipelines for code quality, security scanning, and operational insights 

  • Lead adoption of AI-enhanced SRE practices, including intelligent remediation and predictive operations

  • Champion DevOps and SRE practices including Infrastructure as Code, CI/CD, observability, and reliability engineering 

  • Build developer-friendly platforms (“golden paths”) that simplify deployments, reduce friction, and improve velocity 

  • Enable and optimize infrastructure for AI/ML workloads, including data pipelines, storage systems, inference environments, and GPU-enabled and high-performance compute workloads

  • Build and manage containerized and orchestrated platforms (Docker, Kubernetes) 

  • Support cloud migration, modernization, and platform standardization initiatives 

  • Ensure systems meet security, compliance, backup, and disaster recovery requirements 

  • Evangelize and promote best practices in DevOps, SRE, and platform engineering to developer communities

  • Stay abreast of new technologies in areas including, but not limited to, AIOps, MLOps, cloud computing and deployment, site reliability engineering, infrastructure automation, security best practices, and data engineering

     

Requirements:

  • Must have a total of 6+ years of experience in DevOps/SRE roles, working with monitoring and observability tools (Prometheus, Grafana, ELK, or cloud-native equivalents) for on-prem and cloud-hosted workloads

  • Must have 4+ years of hands-on Linux experience, including Ubuntu/CentOS/Red Hat operating systems, containers, dependency management, and administration support

  • Must have 4+ years of experience automating Infrastructure-as-Code (IaC) deployments to one of the following cloud platforms: Amazon AWS, Google GCP, or Microsoft Azure

  • Must have 4+ years with CI/CD and automation tools such as Terraform, Ansible, Chef, Puppet, Jenkins, GitHub Actions

  • Strong scripting skills (Python, Bash, PowerShell or similar)

  • Must be proficient in using vibe coding and coding assistants to develop scripts, tools, and applications for DevOps and SRE use cases

  • Proficiency in debugging, troubleshooting, and/or deploying SQL and/or NoSQL databases, object storage, web servers, and open-source programming stacks (Node.js, R, Python, .NET Core, Java) is desired but not mandatory

  • Must be willing to learn new technologies and to adopt and adapt to emerging technologies or needs from project to project

  • Cloud certifications are preferred

  • Certifications in Grafana, Splunk, Docker, or Kubernetes are preferred but optional

     


Disclaimer: The above description is meant to illustrate the general nature of work and level of effort being performed by individuals assigned to this position or job description. It is not intended as a complete list of all skills, responsibilities, duties, and/or assignments required. Individuals may be required to perform duties outside of their position, job description, or responsibilities as needed.

The diversity of Axle’s employees is a tremendous asset. We are firmly committed to providing equal opportunity in all aspects of employment and will not tolerate any illegal discrimination or harassment based on age, race, gender, religion, national origin, disability, marital status, covered veteran status, sexual orientation, status with respect to public assistance, or other characteristics protected under state, federal, or local law. We are also committed to deterring those who aid, abet, or induce discrimination or coerce others to discriminate.

Accessibility: If you need an accommodation as part of the employment process please contact: [email protected]

This role has a market-competitive salary with an anticipated base compensation range listed below. Actual salaries will vary depending on a candidate’s experience, qualifications, skills, and location.

Top Skills

.NET Core
Amazon AWS
Ansible
Bash
Chef
Docker
ELK
GitHub Actions
Google GCP
Grafana
Java
Jenkins
Kubernetes
Azure
Node.js
OpenTelemetry
PowerShell
Prometheus
Puppet
Python
R
Splunk
Terraform

The Company
HQ: Rockville, MD
191 Employees
Year Founded: 2002

What We Do

Axle Informatics is a bioscience and information technology company that offers advancements in translational research, health informatics, and data science applications to research centers and healthcare organizations around the globe. With experts in biomedical science, software engineering, and program management, we develop and apply research tools and techniques that empower decision-making and accelerate research discoveries. We work with some of the top research organizations and facilities in the country including multiple institutes at the National Institutes of Health (NIH) by offering the responsiveness of a small business coupled with the experience, breadth, and depth of a large organization.
