Senior Site Reliability Engineer

Dublin
In-Office
Senior level
Insurance • Software
The Role
As a Senior Azure Site Reliability Engineer, you'll ensure the availability and performance of a SaaS platform on Azure through automation, monitoring, incident response, and continuous improvement.

As a Senior Azure Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, availability, and performance of our Vew SaaS platform hosted on Microsoft Azure. You will collaborate closely with development, operations, and infrastructure teams to design, implement, and maintain highly scalable and resilient systems. Your primary focus will be on automation, monitoring, incident response, and continuous improvement to enhance the overall reliability of our services.

 

Responsibilities:

System Reliability:

  • Implement and maintain highly available, scalable, and fault-tolerant systems on Azure.
  • Monitor system health and performance metrics to ensure reliability and proactively address issues.
  • Maintain a set of metrics and reporting to demonstrate the operational performance of the Incident & Problem Management processes.

Automation:

  • Develop and maintain automation scripts and tools for provisioning, deployment, monitoring, and scaling of services.
  • Implement Infrastructure as Code (IaC) using tools like Azure Resource Manager templates to ensure consistent and reproducible environments.
  • Leverage AI-based automation to predict and prevent incidents before they impact customers.

Monitoring and Alerting:

  • Configure and maintain monitoring solutions to provide real-time visibility into system health and performance.
  • Define and implement alerting strategies to detect and respond to incidents in a timely manner.

Incident Response:

  • Respond to and resolve incidents, including root cause analysis, mitigation, and communication with stakeholders.
  • Develop and maintain incident response playbooks to streamline response processes.
  • Continue to develop robust incident management processes that enable effective support of our customers from an Incident & Problem perspective.
  • Ensure support issues are resolved within the contractual SLAs.
  • Conduct post-incident reviews and implement recommendations to prevent recurrence.

Security and Compliance:

  • Ensure systems and infrastructure adhere to security best practices and compliance requirements.
  • Implement and maintain security controls, encryption, and access management mechanisms.

Continuous Improvement:

  • Identify areas for optimization and implement solutions to improve system reliability, performance, and efficiency.
  • Participate in regular reviews and retrospectives to drive continuous improvement in processes and systems.
  • Drive continuous service improvement to work towards achieving operational excellence.
  • Maintain up-to-date knowledge of the latest technologies and best practices in application support.
  • Engage with Development and Quality Assurance teams on support issues.

 

Key Skills/Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or related field.
  • Proven experience as a Site Reliability Engineer or similar role, preferably in a SaaS environment.
  • Strong proficiency in Microsoft Azure services, including compute, networking, storage, and monitoring.
  • Experience with automation tools and scripting languages such as PowerShell.
  • Solid understanding of containerization technologies (e.g., Docker, Kubernetes) and orchestration tools.
  • Work with the DevOps team to improve the CI/CD pipeline for reliability and deployment efficiency.
  • Experience with Bicep/Terraform and ARM templates for Infrastructure as Code (IaC).
  • Hands-on experience with monitoring and logging tools such as Azure Monitor, Grafana, Prometheus, or Datadog.
  • Knowledge of security best practices, compliance standards (e.g., ISO27001, SOC 2, GDPR), and relevant regulations.
  • Excellent problem-solving skills and the ability to troubleshoot complex technical issues.
  • Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
  • Azure certifications such as Azure Administrator Associate or Azure Solutions Architect Expert are a "nice to have".

 

Who we are: 

DOCOsoft is a leading software and services provider to Lloyd’s of London and the broader London insurance market. It was founded in 2008 and has since grown to become one of the leading insurance software specialists in the London Insurance Market. We are a growing team of approximately 95 with offices in London, Dublin, Tokyo, Portugal and Poland.

DOCOsoft aspires to be a market leader in the technology sector, and we are always looking for new ways to approach projects or improve existing content. We look to hire people who will help us achieve this with hard work, enthusiasm and the expression of their own ideas.

We offer our people:  

  • The opportunity to impact our growing business: put your own stamp on the role and the company.
  • Exciting challenges to grow – we motivate and mentor junior members of the team, many of whom joined as interns and have progressed through the organisation.
  • A competitive salary.
  • Company pension.
  • Health Insurance.
  • Remote and flexible working.
  • 25 days annual leave.

 

Equal Opportunity Employer  

DOCOsoft is committed to building an inclusive and diverse team that represents a variety of backgrounds, experiences and perspectives. We welcome applications from all suitably qualified candidates, and do not discriminate on the grounds of race, religion, gender, marital or family status, age, disability, sexual orientation, membership of the travelling community or any other basis as protected by applicable law. Should you require reasonable accommodations during any stage of the recruitment process, please let us know.

Top Skills

ARM Templates
Azure Monitor
Bicep
Datadog
Docker
Grafana
Kubernetes
Azure
PowerShell
Prometheus
Terraform

The Company
HQ: Dublin, Dublin
69 Employees

What We Do

DOCOsoft is an innovative developer of technology solutions for the global insurance and financial services markets, with a long-standing record of success. Since the late 90s, DOCOsoft has developed specialist software, widely used in the London market and further afield, to serve clients and secure business continuity.

DOCOsoft's claims management system helps some of the world’s largest multi-national insurers and reinsurers maximise performance through business process automation, enabling teams to save time and get ahead. We bring clarity to a complicated marketplace through integrated workflow and data management solutions. DOCOsoft has built an excellent and rapidly growing reputation within the €70 billion London Insurance Market.

DOCOsoft has operated independently to innovate freely and own intellectual property outright. This freedom ensures we're in complete control of our capabilities to propel firms into the future as things change in a fast-paced marketplace. All products marketed under the DOCOsoft brand are designed and developed in Ireland by DOCO System Solutions Ltd. DOCOsoft is privately owned and operates internationally from offices in Dublin, London, and Tokyo.
