RC Talent Solutions

Graphite - Site Reliability Engineer (SRE)

Hiring Remotely in Guadalajara, Jalisco, MEX

Remote or Hybrid

50K-60K Annually

Mid level

Information Technology • Professional Services

The Role

The Site Reliability Engineer will ensure service stability, implement monitoring systems, manage cloud infrastructure, automate workflows, and respond to incidents.

Site Reliability Engineer (SRE)

Overview

We're looking for a passionate and hands-on Site Reliability Engineer (SRE) to join our team. This role is critical for ensuring the stability, performance, and scalability of our production services. You'll be the bridge between development and operations, with a strong focus on using code to manage infrastructure and eliminate toil.

Key Responsibilities

Monitoring and Alerting: Design, implement, and maintain robust monitoring and alerting systems (e.g., GCP Monitoring, Prometheus, Grafana, tracing, and logging) to provide visibility into application performance and infrastructure health.
Infrastructure Management: Build, provision, and maintain our core infrastructure, with a strong emphasis on Cloud environments and Kubernetes clusters.
Automation and Tooling: Write and maintain scripts and automation workflows (e.g., in Python, Bash, or TypeScript with Pulumi) to streamline deployment, scaling, and operational tasks, embracing the philosophy of "automating everything."
Incident Response: Provide hands-on, real-time incident response and participate in an on-call rotation to quickly mitigate service disruptions and restore functionality.
Production Debugging: Debug and troubleshoot complex production problems in depth across the entire stack, from network issues to application code defects.
Process Improvement: Conduct blameless post-mortems for major incidents, implementing long-term solutions to prevent recurrence and continuously improve service reliability.

Qualifications

Proven experience as an SRE or DevOps Engineer, or in a similar role.
Expertise in managing and scaling Kubernetes in a production environment.
Strong proficiency in a scripting or programming language (e.g., Python, Go, Bash).
Deep understanding of monitoring, logging, and alerting best practices.
Solid experience with at least one major Cloud provider (AWS, GCP, or Azure).
Experience with Infrastructure as Code (IaC) tools like Terraform or Pulumi is a plus.

What You'll Bring

A proactive, data-driven approach to reliability and a passion for managing complex systems at scale.

Compensation

The base pay range for this role is $50,000 – $60,000 per year.

Skills Required

Proven experience as an SRE or DevOps Engineer
Expertise in managing and scaling Kubernetes in production
Strong proficiency in Python or similar scripting languages
Deep understanding of monitoring and alerting best practices
Experience with major Cloud providers (AWS, GCP, or Azure)

The Company

10 Employees

Year Founded: 2022

What We Do

RC Talent Solutions is a premier technology recruiting partner specializing in personalized tech staffing for IT roles, helping startups, enterprise teams, and global brands build world-class tech.