Sr Site Reliability Engineer

Sorry, this job was removed at 08:16 p.m. (CST) on Tuesday, Apr 29, 2025
Hiring Remotely in Brazil
Remote
Fintech • Payments • Software • Financial Services
The Role

Description

Come and impact millions of Brazilians!

Want to make a difference in the lives of millions of Brazilians? At RecargaPay, we create accessible and innovative financial solutions that transform the way people interact with money. Be part of this impactful and innovative journey, connecting people with opportunities that truly make a difference in their daily lives.

Our purpose is to deliver the best mobile payment experience for Brazilians, addressing real-world challenges with smart solutions like Pix Parcelado, while staying attentive to market trends and our customers' needs. Here, we value collaboration, ownership, and a relentless pursuit of results, delivering excellence in every interaction.

If you’re looking to join a dynamic environment that challenges the status quo and puts people at the center of decision-making, RecargaPay is the perfect place for you to grow, co-create, and make a difference!

Responsibilities

We are looking for a Senior Site Reliability Engineer (SRE) to define and implement monitoring and observability standards, ensuring the reliability and efficiency of our environment. This professional will be responsible for analyzing metrics and alerts, anticipating failures, identifying infrastructure and application bottlenecks, and proposing architectural improvements to enhance efficiency and availability. They will also play a key role in post-mortems, sharing knowledge and contributing to effective action plans.

  • Define and enhance monitoring and observability standards;
  • Support the definition and monitoring of SLIs/SLOs and other key performance indicators to ensure alignment with reliability goals;
  • Analyze metrics and alerts to anticipate failures and optimize performance;
  • Identify bottlenecks and areas for improvement in infrastructure and applications;
  • Propose and implement software architecture and infrastructure improvements to increase efficiency and availability;
  • Lead and support post-mortems, promoting best practices and lessons learned;
  • Document best practices, incident learnings, and technical solutions to foster knowledge sharing and accelerate problem resolution;
  • Work in a GitOps environment, using GitHub Actions for automation;
  • Collaborate with development and infrastructure teams to ensure service resilience and scalability;
  • Conduct troubleshooting and performance optimization in containers and Kubernetes (EKS);
  • Serve as a technical reference for reliability, supporting the adoption of SRE practices across squads and contributing to the evolution of engineering culture;
  • Work alongside Security, Platform, and Data teams to ensure a holistic approach to reliability and scalability;
  • Demonstrate the ability to influence technical decisions and drive improvements, even in teams they do not work with directly;
  • Maintain a mindset focused on continuous learning, resilience in handling incidents, and a strong emphasis on prevention and automation.

Requirements

  • Experience with monitoring and observability tools, including New Relic, Prometheus, and Grafana;
  • Proficiency in GitHub and GitOps practices with GitHub Actions;
  • Strong experience with AWS and infrastructure as code using Terraform and Terragrunt;
  • Experience with microservices architecture and Kubernetes;
  • Solid knowledge of SRE, resilience, performance, and automation;
  • Hands-on experience with troubleshooting and performance tuning in complex environments;
  • Expertise in infrastructure and problem analysis in containers and Kubernetes (EKS);
  • Knowledge of scripting and automation tools such as Python, Ansible, and shell scripting (preferred);
  • Experience with distributed environments, high availability, and scalability;
  • Familiarity with post-mortems and incident response.

Nice to Have:

  • Certifications in AWS, Terraform, Kubernetes, or DevOps;
  • Contributions to open-source communities or technical publications.

The Company
HQ: Sao Paulo, Sao Paulo
652 Employees
Year Founded: 2010

What We Do

RecargaPay is an all-in-one payments superapp based in Brazil that invites people to switch out of autopilot and rethink their finances based on convenience, affordability, and flexibility.
The platform streamlines payments for over 7 million Brazilians by consolidating credit and debit cards, instant payments like Pix, and Open Finance, on a mission to democratize mobile payments and financial services in Brazil.
It features services such as bill payments, mobile top-ups, public transportation, installment plans, and loans, all designed with convenience, low cost, and flexibility in mind. RecargaPay is changing the way both banked and unbanked Brazilians make their everyday payments and access financial services.
Founded in 2010, RecargaPay has received over $120 million in funding from investors including IFC and IADB, and is authorized as a Payments Institution and SCD by the Brazilian Central Bank.
