toters delivery

Site Reliability Engineer

Posted 4 Days Ago

Metán, Salta

In-Office

Mid level

eCommerce • Food • Retail

The Role

The Site Reliability Engineer will ensure system reliability and performance, manage incidents, design monitoring systems, and automate processes in a cloud environment.

Summary Generated by Built In

The Company

Toters is an on-demand e-commerce and delivery platform that lets customers get anything in their city with the highest level of convenience.

At Toters, technology is at the heart of everything we do. Our product teams work hard every day to create products that make our customers' lives easier. Our engineers continuously build solutions that make our processes more efficient, all in an effort to reach our customers quickly and at the best cost. If you are interested in working in a high-growth startup environment and want to be part of a team that could change the way customers shop in the Middle East, apply now.

About the Role

We are looking for a Mid-Level Site Reliability Engineer who will play a critical role in ensuring high availability, performance, and resilience across our production systems. You will be at the heart of operational excellence, leading high-impact incident responses, building proactive monitoring systems, and engineering automation that prevents outages before they happen. If you love solving complex distributed system challenges and thrive in high-pressure environments, this role is for you.

Key Responsibilities

Incident Management & Reliability

Act as Incident Commander during major outages, leading real-time diagnosis, communication, and recovery.
Own and improve the end-to-end incident management lifecycle, including post-incident reviews and action plans.
Drive root cause analysis and proactive reliability improvements to prevent recurrence.

Monitoring & Observability

Design and maintain metrics, alerts, and dashboards using Prometheus, Grafana, and New Relic.
Implement SLIs/SLOs to monitor service health and drive availability targets (99.99%+ uptime).
Integrate log management and distributed tracing with tools like ELK Stack and AWS X-Ray.

Automation & Tooling

Develop automation scripts and internal tooling in Python or Node.js to reduce manual ops and accelerate recovery (MTTR improvement).
Build self-healing infrastructure using IaC and automation pipelines.
Optimize on-call workflows, escalation policies, and runbooks using PagerDuty.

Cloud Infrastructure

Operate and improve infrastructure hosted on AWS, ensuring reliability, cost efficiency, and scalability.
Collaborate with backend and platform teams to embed SRE best practices across engineering.

Key Qualifications

2–4 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
Proven success managing production incidents and participating in on-call rotations.
Strong hands-on experience with Prometheus, Grafana, and PagerDuty.
Proficient in Python or Node.js for automation and tooling.
Experience with AWS services (EC2, CloudWatch, ECS/Lambda, IAM, etc.).
Solid understanding of Linux systems, networking, and CI/CD pipelines.

Nice to Have

Experience as Incident Commander in mission-critical environments.
Knowledge of New Relic, Sentry, ELK Stack, or Datadog.
Background implementing SLIs/SLOs/Error Budgets (Google SRE model).
Familiarity with Docker, Kubernetes, Terraform, or Ansible.
Certifications such as:
- AWS Solutions Architect Associate or DevOps Engineer.
- ITIL Foundation or relevant reliability certifications.

Top Skills

Ansible

AWS

AWS X-Ray

CI/CD

Docker

ELK Stack

Grafana

Kubernetes

Node.js

PagerDuty

Prometheus

Python

Terraform

The Company

HQ: Beirut

744 Employees

Year Founded: 2015

What We Do

Enabling last-mile, same-day delivery of any local product near you. Available for iPhone and Android, the Toters service connects customers with retailers and local couriers, who purchase and deliver goods from any grocery store, restaurant, or other retail shop in your city.

Download the Toters app today or visit www.totersapp.com