NewDay

Senior Site Reliability Engineer

Posted 23 Days Ago

London, Greater London, England, GBR

In-Office

Senior level

Software • Financial Services

The Role

Lead reliability initiatives across the platform by automating infrastructure and operational processes, building observability (monitoring, logging, tracing), driving incident management and root cause analysis, and collaborating with engineering teams to embed SRE practices, resilience, and performance into delivery.

Summary Generated by Built In

Mission Statement & Summary

As a Senior Site Reliability Engineer, you'll sit at the intersection of software engineering and operations, driving reliability, performance, automation, and resilience across our technology estate.

This is an opportunity to shape the future of our platform rather than simply maintain it. You'll work alongside talented engineers, influence technical direction, and champion modern reliability practices that enable teams to move faster with confidence. If you're passionate about solving complex problems, eliminating toil through automation, and creating systems that are resilient by design, we'd love to hear from you.

How you'll contribute

Lead initiatives that improve platform reliability, scalability, and operational excellence.
Design and deliver automation solutions that reduce manual effort and accelerate engineering teams.
Develop observability capabilities, enabling proactive monitoring and faster incident resolution.
Facilitate incident management, driving root cause analysis and continuous improvement.
Collaborate with engineering teams to embed reliability, resilience, and performance into every stage of delivery.
Contribute to a large-scale migration to OpenTelemetry.

We're looking for these essential skills

Software engineering and design experience (preferably .NET/C#) to build and improve production systems, apply solid design principles, and contribute directly to codebases to deliver reliable, scalable, and maintainable services.
The ability to automate infrastructure, operational processes, and deployments using modern engineering practices.
Experience building effective observability solutions, including monitoring, logging, alerting, and tracing.
Strong problem-solving skills with the ability to diagnose and resolve complex production issues.
The ability to influence technical decisions and collaborate effectively across engineering and business teams.
Experience with OpenTelemetry instrumentation.

It's a plus if you also have these skills

Experience operating Kubernetes-based platforms at scale.
Knowledge of Infrastructure as Code tools and cloud platform services.
Experience implementing Site Reliability Engineering principles, including SLOs, SLIs, and error budgets.
Familiarity with security, compliance, and resilience best practices within cloud environments.
Experience mentoring engineers and helping teams adopt modern operational and reliability practices.

At NewDay, we value all types of diversity. We’re an equal opportunity employer and believe that our differences create a vibrant, authentic working culture. We want all our colleagues to feel able to bring their whole selves to work. We don’t discriminate on the basis of protected characteristics or identities. We make sure that every job is crafted to be inclusive and that people with disabilities or caring responsibilities can take part in the application and interview process.

Tell us if you need accommodations: We’ll put reasonable adjustments in place to support you.

We work with Textio to make our job design and hiring inclusive.

Permanent

Skills Required

Software engineering and design experience (preferably .NET/C#) to build and improve production systems
Ability to automate infrastructure, operational processes, and deployments using modern engineering practices
Experience building observability solutions, including monitoring, logging, alerting, and tracing
Strong problem-solving skills with ability to diagnose and resolve complex production issues
Ability to influence technical decisions and collaborate effectively across engineering and business teams
Experience operating Kubernetes-based platforms at scale
Knowledge of Infrastructure as Code tools and cloud platform services
Experience implementing Site Reliability Engineering principles, including SLOs, SLIs, and error budgets
Familiarity with security, compliance, and resilience best practices within cloud environments
Experience mentoring engineers and helping teams adopt modern operational and reliability practices

The Company

HQ: London

1,150 Employees

Year Founded: 2001

What We Do

At NewDay, our business is focused on a single, clear and defining purpose: to help people move forward with credit. We provide nearly 4 million customers with responsible access to credit, underpinned by best-in-class customer service and exceptional user experience. Our in-house, highly scalable digital platform, alongside our proprietary credit decisioning capability, unlocks our competitive advantage. Our broad credit product offering enables instalment finance, BNPL, 0% finance, and carded and digital revolving credit. We operate multiple direct-to-consumer products and a range of credit solutions with some of the UK’s most loved brands. Our underwriting capability and experience allow us to responsibly say yes to more UK customers, making us a merchant partner of choice. We partner to harness the power of data to drive commerce across the UK, creating value at scale.