Senior Site Reliability Engineer

Posted 22 Days Ago
Hiring Remotely in Ciudad De México, MEX
Remote
Senior level
Fintech • Payments • Financial Services
The Role
The Senior Site Reliability Engineer will design resilient systems, define SLIs/SLOs, improve incident response, and mentor engineers while ensuring operational excellence and system performance.
Summary Generated by Built In
About EarnIn

As one of the pioneers of earned wage access, EarnIn is passionate about building products that deliver real-time financial flexibility for people living paycheck to paycheck. Our community members access their earnings as they earn them, with options to spend, save, and grow their money without mandatory fees, interest, or credit checks.

We’re fortunate to have an incredibly experienced leadership team, combined with world-class funding partners like A16Z, Matrix Partners, DST, Ribbit Capital, and a very healthy core business with a tremendous runway. We’re growing fast and are excited to continue bringing world-class talent onboard to help shape the next chapter of our growth journey.

WHY this role exists

EarnIn’s community members rely on our products to perform consistently, respond promptly, and instill trust. Reliability goes beyond infrastructure; it shapes the customer experience. Product teams must deploy rapidly, but they must also develop systems that are observable, resilient, easy to operate, and safe to update.

This role exists to elevate the reliability of EarnIn’s production systems while empowering engineering teams to advance swiftly with assurance. As a Senior Site Reliability Engineer, you will spearhead reliability enhancements that fortify services, streamline incident management, and foster sustainable on-call practices.

HOW you will create impact

  • Act as a senior technical owner for reliability initiatives. Collaborate across systems, teams, and failure modes to strengthen how EarnIn designs, observes, deploys, and manages production services.
  • Combine software engineering fundamentals with reliability thinking. Rather than just responding to incidents, apply lessons learned to improve systems, alerts, runbooks, and ownership so that failures do not repeat.
  • Leverage AI-assisted engineering practices, such as machine learning monitoring tools and anomaly detection systems, to minimize operational toil, accelerate investigations, refine infrastructure workflows, and enable teams to analyze production behavior more effectively.
  • Mentor engineers and coach product teams to embed reliability practices that clarify, streamline, and safeguard their services.

WHAT you will own

Reliable system design

  • Engineer and refine systems with a focus on resilience, graceful degradation, capacity, and well-understood failure modes.
  • Collaborate with engineering teams to surface and address reliability risks during design, implementation, launch, and operation.
  • Make services simpler to debug, easier to operate, and more predictable under failure.

SLOs, observability, and production signals

  • Define and measure SLIs and SLOs that reflect real customer experience.
  • Apply observability tools and signals, such as Datadog, CloudWatch, logs, metrics, traces, and APM, to create high-signal, low-noise operational visibility.
  • Elevate alerting quality so pages drive action, reach the right people, and warrant human intervention.

Incident lifecycle improvement

  • Direct and optimize incident response practices from detection and triage to communication, resolution, postmortems, and follow-up.
  • Extract incident learnings to implement lasting technical and process improvements.
  • Guide teams to reduce repeated incidents and cultivate a quieter on-call environment.

Operational tooling and AI-assisted leverage

  • Develop or refine tooling that eliminates toil, accelerates root-cause analysis, and streamlines infrastructure-as-code workflows.
  • Apply AI-assisted development and operational workflows responsibly to hasten investigations, enhance documentation, evolve runbooks, and automate repetitive engineering tasks.
  • Help teams adopt practical AI-assisted workflows where they measurably improve quality, speed, or operational clarity.

Mentorship and engineering enablement

  • Coach engineers in reliability practices, observability, incident response, and production ownership.
  • Write documentation and runbooks that reduce silos and make operational knowledge easier to use.
  • Articulate reliability tradeoffs persuasively to both technical and non-technical partners.

WHAT we're looking for

  • Bachelor’s or master’s degree in Computer Science or equivalent industry experience.
  • 4+ years of experience in SRE, Software Engineering, Infrastructure Engineering, or a related role.
  • Hands-on coding experience in Python, Go, or similar languages.
  • Experience designing, operating, and improving distributed systems in production.
  • Strong understanding of SLIs, SLOs, error budgets, MTTR, incident response, and how to use reliability data to drive decisions.
  • Strong observability and debugging skills using logs, metrics, traces, dashboards, and production signals.
  • Experience improving alert quality, runbooks, incident processes, and follow-through after production issues.
  • Ability to lead reliability initiatives across teams and mentor engineers toward better operational practices.
  • Experience using AI-assisted development or operational tools, such as GitHub Copilot or Datadog.

At EarnIn, we believe that the best way to build a financial system that works for everyday people is by hiring a team that represents our diverse community. Our team is diverse not only in background and experience but also in perspective. We celebrate our diversity and strive to create a culture of belonging. EarnIn does not unlawfully discriminate based on race, color, religion, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), gender identity, gender expression, national origin, ancestry, citizenship, age, physical or mental disability, legally protected medical condition, family care status, military or veteran status, marital status, registered domestic partner status, sexual orientation, genetic information, or any other basis protected by local, state, or federal laws. EarnIn is an E-Verify participant. 

EarnIn does not accept unsolicited resumes from individual recruiters or third-party recruiting agencies in response to job postings. No fee will be paid to third parties who submit unsolicited candidates directly to our hiring managers or HR team.

Skills Required

  • Bachelor's or Master's degree in Computer Science or equivalent industry experience
  • 4+ years of experience in an SRE or Software Engineering role
  • Hands-on coding experience in Python and/or Go
  • Proven experience designing and operating large-scale distributed systems
  • Deep fluency in SLOs, SLIs, error budgets, and MTTR
  • Calm under pressure with skills in diagnosing incidents from logs and metrics
  • Ability to work across technical and non-technical teams
  • Ability to select and use operational tools efficiently
  • Ability to lead strategic reliability initiatives and mentor engineers

The Company
HQ: Palo Alto, CA
229 Employees
Year Founded: 2012

What We Do

EarnIn’s mission is to build a financial system that works for people. Every year, while Americans wait for their paychecks, more than $1 trillion of their hard-earned money is held up in the pay cycle. As a result, we accumulate over $50 billion in late and overdraft fees and turn to high-interest loans. We seek to eliminate those fees and put money back into workers’ hands. Our financial system doesn’t work for people. But EarnIn does. EarnIn is an app that lets people get paid as soon as they leave work, with no fees, interest, or hidden costs. App users can receive their money in their bank account instantly at little or no cost, as we operate on a pay-what-you-choose model. All they need is a bank account and a job that provides direct deposit or uses electronic timesheets. At EarnIn, we’re building the way we think a financial system should work for everyone, not just the people who can afford it. We help people take control of their money and get to a better financial place. Our goal is not only to provide great products at little or no cost to the people who need them but also to inspire kindness across the financial world and eventually across all industries.
