About the role
The Site Reliability Engineering (SRE) team at ESO is a high-impact group responsible for ensuring the reliability, scalability, and performance of our production systems. We operate at the intersection of engineering and operations, with a strong focus on automation, resilience engineering, and continuous improvement.
As a Senior Site Reliability Engineer, you will take a leadership role in designing and implementing scalable, observable, and fault-tolerant systems. You’ll oversee deployments across pre-production and production environments, drive automation of operational workflows, and lead incident response and postmortem analysis. You will collaborate closely with engineering and infrastructure teams to proactively identify risks, optimize system health, and uphold service-level objectives.
In this role, you'll also help shape SRE best practices across the organization by introducing new technologies, mentoring engineers, and enhancing our tooling ecosystem to ensure operational excellence. Your work will directly impact the stability of ESO’s platform and the quality of service delivered to our customers.
More about you
You are an ideal fit for this role if you are a proactive, self-directed engineer with a deep sense of ownership and a commitment to operational excellence. You bring both curiosity and discipline to your work—balancing thoughtful experimentation with precision and rigor. As a senior technologist, you take pride in building resilient systems, mentoring others, and contributing to a culture of reliability, accountability, and continuous improvement.
Your qualifications
Essential Qualifications:
- Proven experience designing and managing CI/CD pipelines using tools such as Azure DevOps, Jenkins, Bamboo, or Octopus, with a strong emphasis on reliability, automation, and release velocity.
- Familiarity with enterprise collaboration and ITSM tools such as JIRA, Confluence, or Salesforce.
- Advanced knowledge of Windows Server administration and Microsoft Azure services, including compute, networking, identity, serverless, and monitoring components.
- Proficiency in scripting and automation, preferably in PowerShell and C#/.NET, with an ability to design maintainable, scalable solutions.
- Strong ability to read and understand code written in C# and .NET, including the capability to analyze and interpret memory dumps for troubleshooting and debugging.
- Skilled in writing and troubleshooting SQL queries to identify and resolve performance bottlenecks and data-related issues.
- Strong background in observability engineering, with hands-on experience using platforms like New Relic and PagerDuty to implement monitoring, alerting, and performance analysis at scale.
- Demonstrated ability to troubleshoot and debug complex, cloud-native applications with a focus on resiliency, system health, and performance optimization.
- Deep understanding of application monitoring concepts such as OpenTelemetry, sampling, logs, and traces, with the ability to effectively navigate observability data to identify root causes.
- Excellent written and verbal communication skills, capable of articulating technical concepts to both engineering and non-technical stakeholders.
- A continuous learner with a strong sense of ownership, passionate about operational excellence, reliability engineering, and enabling developer productivity.
Preferred Qualifications:
- Experience working with Linux VMs and administering cloud environments.
- Experience with infrastructure-as-code tools, particularly Terraform.
- Proficiency with Git and modern version control workflows.
Applicant Privacy Notice – please click here to review the applicant privacy notice which details how your data is collected, used and protected.
What We Do
ESO is a fast-paced, growing data, technology, and research company passionate about improving community health and safety through the power of data. We pioneer innovative, user-friendly software to meet the changing needs of today’s EMS agencies, fire departments, and hospitals. We’re small enough to be nimble and fun, but big enough to be a great place to work. We serve thousands of customers out of our four US offices and our Belfast, Northern Ireland office.
We believe in the power of data to improve community health and safety. That’s not just some lofty corporate vision statement — it’s something we live, breathe and see the results of every day. We approach our work as if the lives of our own families and friends depended on the results. Because a lot of the time … they do.
Why Work With Us
We believe in taking great care of our customers and our employees. We believe work ought to be both challenging and fun. (Otherwise what’s the point?) We believe it’s worthwhile to continually push for something better, to pursue excellence for the sake of excellence, and to hold each other to the same standard.