Site Reliability Engineer

Posted Yesterday
London, Greater London, England, GBR
Hybrid
Mid level
Artificial Intelligence • Computer Vision • Machine Learning • Security

About Augur

Augur transforms legacy sensor systems into real-time operational intelligence. We help organisations act before threats escalate – not after damage is done. Everything we build is privacy-first by design.

Our platform connects to existing cameras and sensors to detect threats, retrace events, and surface real-time insights – all without replacing a single device. Augur helps teams reduce risk, cut costs, and grow revenue through better visibility, faster decisions, and fewer blind spots.

Our culture is built on a foundation of radical candor and mutual trust. We recognise that being an industry leader requires more than just building exceptional products – it requires empowering a team of exceptional people. Individually, we operate with high autonomy; collectively, we’re unified by a shared drive to achieve our mission through a commitment to excellence.

The Role

As a Site Reliability Engineer at Augur, you will be responsible for the availability, performance, and scalability of our critical systems. You will bridge the gap between software development and IT operations by applying a software engineering mindset to system administration.

  • System Architecture and Design: Design, build, and maintain the core infrastructure that runs our platforms. Develop scalable, secure, and highly reliable systems to support our services.

  • Automation and Tooling: Automate manual operational tasks, including deployment, monitoring, and incident response. Develop and maintain tools to improve efficiency and reduce the potential for human error.

  • Performance and Reliability: Proactively monitor system performance, identify bottlenecks, and implement solutions to ensure high availability and low latency. Define and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs).

  • Incident Management: Participate in an on-call rotation to respond to production incidents. Lead incident response, conduct root cause analysis (RCA), and implement preventative measures to avoid future occurrences.

  • Software Development: Write and review code for infrastructure automation, monitoring tools, and system improvements. Contribute to the development lifecycle to ensure reliability is built into our products from the start.

  • Capacity Planning: Forecast future infrastructure needs and scale our systems accordingly. Analyse system usage and performance data to make informed decisions about resource allocation.

  • Collaboration and Best Practices: Work closely with software development teams to promote reliability best practices. Champion a culture of reliability across the engineering organisation.

  • Documentation and Knowledge Sharing: Maintain comprehensive, up-to-date documentation for all systems, processes, and operational procedures as a core part of everyday work. Treat documentation as a first-class engineering deliverable.

Accessibility and inclusivity

At Augur, our culture is built on radical candor and mutual trust. To solve the complex, physical-world problems our customers face, we need a team that thinks differently.

Whether you’re self-taught, have a non-traditional background, or are returning to the workforce, if you have the grit and the experience we’re looking for at Augur, we’d love to hear from you.

We are committed to fostering a diverse workplace and ensuring an accessible hiring experience. Please let us know if you require any accommodations during the interview process to help you do your best work.

Skills Required

  • Design, build, and maintain core infrastructure for platform availability and scalability
  • Automate operational tasks including deployment, monitoring, and incident response
  • Proactively monitor performance, identify bottlenecks, and manage SLOs/SLIs
  • Participate in on-call rotation and lead incident response and root cause analysis
  • Write and review code for infrastructure automation, monitoring tools, and system improvements
  • Plan capacity and forecast infrastructure needs based on usage and performance data
  • Collaborate with software teams to promote reliability best practices
  • Maintain comprehensive, up-to-date documentation for systems, processes, and operational procedures

The Company
Year Founded: 2024

What We Do

Augur Initiative Ltd is a sensor technology company that transforms legacy sensor systems into real-time operational intelligence. It provides sensor fusion and machine learning solutions for security applications, enabling organisations to detect threats and surface real-time insights through a privacy-first platform that connects to existing cameras and sensors without requiring device replacement.
