About us:
Amach is an industry-leading, technology-driven company with headquarters in Dublin and remote teams in the UK and Europe.
Our blended teams of local and nearshore talent are optimised to deliver high quality and collaborative solutions.
Established in 2013, we specialise in cloud migration and development, digital transformation including agile software development, DevOps, automation, data and machine learning…
This role is focused on the development and maintenance of the Observability Platform, with secondary responsibilities in Site Reliability Engineering (SRE). It blends software engineering and systems administration to drive performance, scalability and reliability across a growing cloud estate. The successful candidate will support the design, deployment and optimisation of observability tooling, enable operational excellence through automation, and contribute to the broader reliability strategy.
You will report to the SRE Lead, who will support your success through appropriate task allocation and opportunities for technical growth. The role sits within a team responsible for both observability and reliability, providing operational support, driving continuous improvement, and ensuring system integrity across the production environment.
Required skills:
- 5+ years’ experience in cloud infrastructure or application development, with at least 2 years in Site Reliability Engineering roles
- Strong background in distributed systems, cloud-native architectures, and production-grade operational support
- Hands-on expertise with observability and monitoring tools such as Prometheus, Grafana, Loki, Tempo, ELK stack, OpenTelemetry, Datadog, or Splunk
- Proficient in Infrastructure-as-Code and automation using Terraform, Ansible, and Helm
- Skilled in containerisation and orchestration technologies, especially Docker and Kubernetes
- Solid experience with CI/CD pipelines (e.g., Jenkins, GitLab, GitHub Actions) and scripting/programming in Python, Go, or Bash
- Working knowledge of AWS services (EKS, EC2, S3, ASG, Load Balancers, IAM, etc.) and core networking principles
- Familiarity with incident management and alerting tools such as Alertmanager, OpsGenie, and CloudWatch
- Comfortable with modern data storage systems (MySQL, PostgreSQL, MongoDB) and performance and resilience testing tools (k6, Chaos Monkey)
- Strong problem-solving mindset with the ability to automate workflows, reduce manual effort, and proactively troubleshoot issues
- Demonstrated ability to collaborate across teams, communicate technical concepts clearly, and balance technical priorities with business needs
- High degree of ownership, accountability, and a customer-centric approach to system reliability and usability
Key responsibilities & duties include:
- Design, implement and maintain observability systems (metrics, logs, traces, alerting)
- Develop tooling and automation to enhance monitoring and incident management
- Partner with developers and infrastructure teams to define and track SLOs/SLAs/SLIs
- Improve system reliability and performance through data-driven insights
- Build self-service tooling and dashboards for engineering teams
- Participate in on-call rotation and incident response activities
- Optimise on-call workflows through automation and tooling
- Provide operational support and troubleshoot distributed software systems
- Collaborate on reliable and secure code deployment processes
- Analyse infrastructure data to fine-tune performance and availability
- Support CI/CD initiatives and delivery pipelines
- Champion continuous improvement and DevOps best practices
- Ensure capacity, compliance, continuity and service governance standards are met
- Document processes, systems and observability best practices
- Use trend analysis to identify potential system or process bottlenecks
- Provide regular reports on production system health and performance
Desirable skills & competencies:
- Fast learner with the ability to upskill independently
- Knowledge of SaaS, IaaS, and PaaS models
- In-depth experience with enterprise-grade monitoring tools
- Familiarity with major CI/CD strategies at scale
- Proven track record in implementing automation technologies
- Exposure to large-scale incident management strategies and tooling
What’s in it for you:
- An opportunity to join a fast-growing company
- Options for career advancement
- Learning and development opportunities
- Flexible working environment
- Competitive salaries based on experience
Equal Opportunity Employer:
Amach is an equal opportunity employer and makes employment decisions on the basis of merit. We celebrate diversity and are committed to creating an inclusive environment for all employees. This job description is intended to convey essential responsibilities and qualifications for this role, but it is not an exhaustive list of tasks that an employee may be required to perform.
If you are passionate about driving customer success, advising on strategic solutions, and contributing to product innovation, we would love to hear from you!
Not for you?
Check out all of our open positions on our careers page and follow us on LinkedIn for future opportunities.
P.S. Share this with friends and co-workers! Don't be afraid they'll steal it from you: if you're amazing and smart, we'll find a role for you. We are growing fast and are always looking for talented people.
At Amach, we strive to be an inclusive community of open-minded individuals with different backgrounds and we are committed to fostering, cultivating and preserving a culture of diversity, equity and inclusion. We strongly believe that a diversity of experience and background is essential to create a fulfilling environment and better solutions for our people and our customers. All Amach employees and contractors are expected to honour this policy and act to ensure that every individual is respected in the workplace.
Your personal data
Amach will process your personal information in accordance with the EU's General Data Protection Regulation (GDPR). We will comply with data protection law and principles, which means that your data will be:
- Used lawfully, fairly and in a transparent way
- Collected only for valid purposes and not used in any way that is incompatible with those purposes
- Relevant to the purposes we have told you about and limited only to those purposes
- Accurate and kept up to date
- Kept only as long as necessary for the purposes we have told you about
- Kept securely
If you would like to contact us about your data, please use the following address: [email protected]
What We Do
We help mature organisations evolve into modern digital businesses with faster time to market, increased operational stability and security. We offer a suite of technical services delivered by an experienced team of subject matter experts. Our services can be provided as a fully managed service or as an embedded part of your team. We focus on both short-term and long-term goals that emphasise business outcomes for our customers.
Why choose Amach:
1. Business Agility - We focus on removing your IT debt, so that your company can focus on delivering business value to your customers at pace in a secure and reliable manner.
2. Cost Reduction - We help reduce your IT costs across all domains. This includes operational costs in both cloud and your data centres, licensing, evergreening, and reducing project delivery timelines.
3. Operational Stability - We will modernise your IT systems for operational stability and resilience, so that the end user experience for both customers and staff is enhanced.
4. Enable Innovation - By removing IT debt, we create space for your team to focus on innovation, ensuring business longevity and futureproofing. We bring both sector experience and lessons learned.
5. Security - We address the overall architecture to ensure security by convention vs configuration. We promote DevSecOps practices, so security isn’t seen as preventing teams from delivering efficiently.
6. Improve Employee Experience - We create and implement your EUX strategy that adapts to a changing world, while improving security and user efficiency, resulting in improved colleague satisfaction and retention.
7. Sustainability - Achieve your sustainability goals through optimising your overall IT footprint and reducing costs.
We thrive on delivering customer value in every interaction. If you would like to hear more or see how we could help you, please get in touch: [email protected]







