Augur Initiative Ltd

Site Reliability Engineer

Posted Yesterday

London, Greater London, England, GBR

Hybrid

Mid level

Artificial Intelligence • Computer Vision • Machine Learning • Security

The Role

Design, build, and maintain scalable, secure infrastructure; automate deployments and monitoring; define SLOs/SLIs; participate in on-call incident response and RCA; write infrastructure code; plan capacity; and document systems and processes.

Summary Generated by Built In

About Augur

Augur transforms legacy sensor systems into real-time operational intelligence. We help organisations act before threats escalate – not after damage is done. Everything we build is privacy-first by design.

Our platform connects to existing cameras and sensors to detect threats, retrace events, and surface real-time insights – all without replacing a single device. Augur helps teams reduce risk, cut costs, and grow revenue through better visibility, faster decisions, and fewer blind spots.

Our culture is built on a foundation of radical candor and mutual trust. We recognise that being an industry leader requires more than just building exceptional products – it requires empowering a team of exceptional people. Individually, we operate with high autonomy; collectively, we’re unified by a shared drive to achieve our mission through a commitment to excellence.

The Role

As a Site Reliability Engineer at Augur, you will be responsible for the availability, performance, and scalability of our critical systems. You will bridge the gap between software development and IT operations by applying a software engineering mindset to system administration.

System Architecture and Design: Design, build, and maintain the core infrastructure that runs our platforms. Develop scalable, secure, and highly reliable systems to support our services.
Automation and Tooling: Automate manual operational tasks, including deployment, monitoring, and incident response. Develop and maintain tools to improve efficiency and reduce the potential for human error.
Performance and Reliability: Proactively monitor system performance, identify bottlenecks, and implement solutions to ensure high availability and low latency. Define and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Incident Management: Participate in an on-call rotation to respond to production incidents. Lead incident response, conduct root cause analysis (RCA), and implement preventative measures to avoid future occurrences.
Software Development: Write and review code for infrastructure automation, monitoring tools, and system improvements. Contribute to the development lifecycle to ensure reliability is built into our products from the start.
Capacity Planning: Forecast future infrastructure needs and scale our systems accordingly. Analyse system usage and performance data to make informed decisions about resource allocation.
Collaboration and Best Practices: Work closely with software development teams to promote reliability best practices. Champion a culture of reliability across the engineering organisation.
Documentation and Knowledge Sharing: Maintain comprehensive, up-to-date documentation for all systems, processes, and operational procedures as a core part of everyday work. Treat documentation as a first-class engineering deliverable.

Accessibility and inclusivity

At Augur, our culture is built on radical candor and mutual trust. To solve the complex, physical-world problems our customers face, we need a team that thinks differently.

Whether you’re self-taught, have a non-traditional background, or are returning to the workforce, if you have the grit and the experience we’re looking for at Augur, we’d love to hear from you.

We are committed to fostering a diverse workplace and ensuring an accessible hiring experience – please let us know if you require any accommodations during the interview process to help you do your best work.

Skills Required

Design, build, and maintain core infrastructure for platform availability and scalability
Automate operational tasks including deployment, monitoring, and incident response
Proactively monitor performance, identify bottlenecks, and manage SLOs/SLIs
Participate in on-call rotation and lead incident response and root cause analysis
Write and review code for infrastructure automation, monitoring tools, and system improvements
Plan capacity and forecast infrastructure needs based on usage and performance data
Collaborate with software teams to promote reliability best practices
Maintain comprehensive, up-to-date documentation for systems, processes, and operational procedures

The Company

Year Founded: 2024

What We Do

Augur Initiative Ltd is a sensor technology company that transforms legacy sensor systems into real-time operational intelligence. It provides sensor fusion and machine learning solutions for security applications, enabling organisations to detect threats and surface real-time insights through a privacy-first platform that connects to existing cameras and sensors without requiring device replacement.