Site Reliability Engineer, Associate

Posted 2 Days Ago
Edinburgh, City of Edinburgh, Scotland
In-Office
Mid level
Fintech • Information Technology • Financial Services
Bringing together tech and market expertise to help people build better financial futures.
The Role

About this role

We’re looking for an SRE with strong Kafka experience and a deep understanding of SRE best practices. You’ll combine hands‑on technical work with effective delegation to EventBus developers.

You’ll collaborate closely with the EventBus, Kafka, Telemetry, and Incident Response teams, while also working independently to improve monitoring, reduce noise, strengthen alerting, and track remediation progress.

This role sits at the centre of a global platform used by hundreds of developers. You will join a fast‑growing, experienced SRE group based in Edinburgh.

About the Team
The Aladdin EventBus is built on Kafka and enables teams to publish and subscribe to distributed events in near real time. As part of the Aladdin Graph group—a core Platform Engineering function—the EventBus team supports developers across the firm in designing, building, and operating event‑driven and API‑based systems.

EventBus is now a critical dependency for key applications, including our release system and API infrastructure. This drives a high bar for availability, incident responsiveness, and operational excellence. The SRE function supports this by improving observability, streamlining incident processes, and identifying gaps that meaningfully improve platform reliability.

Key Responsibilities:

As the SRE for EventBus, you will drive stability, resiliency, and observability through:

  • Staying informed on all EventBus incidents, including impact, root cause, detection, and ongoing remediation
  • Responding to incidents calmly and efficiently, communicating clearly with reporters and partner teams, and recommending remediations based on urgency and impact
  • Proposing improvements informed by prior incidents, potential risks, and industry standards—e.g., new metrics, SLOs, fallback mechanisms
  • Leading incident retrospectives and sharing insights with the wider team
  • Creating and distributing postmortems for high‑impact operational events
  • Collaborating with developers to write, maintain, and promote runbooks and playbooks
  • Improving alert quality and reducing alert fatigue by tuning alerts for a better signal‑to‑noise ratio
  • Designing and implementing automated recovery solutions for known issues
  • Building a roadmap toward 24/7 availability, rapid failover recovery, self‑detection, and automated resolution of common issues
  • Helping EventBus users diagnose issues with their own producers and consumers

Requirements

  • 3+ years in an SRE role, including experience defining and managing SLOs
  • Strong understanding of SRE principles (Golden Signals, error budgets, synthetic monitoring, signal‑to‑noise optimisation)
  • Extensive hands‑on experience with Kafka
  • Experience using monitoring tools (Grafana and Splunk preferred), including building dashboards, alerts, and reports

Suggested Requirements

  • Java Development: Experience developing in Java or another object‑oriented language
  • CI/CD & Release Management: Experience managing pipelines using Azure DevOps or other Git‑based tools
  • Cloud Experience: Practical experience with at least one public cloud provider, preferably Azure or AWS
  • Agile Development: Familiarity with agile ways of working, sprint ceremonies, and backlog planning
  • Scripting & Automation: Proficiency in Python or Golang for automating operational tasks
  • Monitoring & Observability: Strong understanding of logging, monitoring, and observability practices, including writing integration scripts
  • Collaboration & Communication: Strong cross‑team collaboration skills and excellent written and verbal communication

Our benefits

To help you stay energized, engaged and inspired, we offer a wide range of employee benefits including: retirement investment and tools designed to help you in building a sound financial future; access to education reimbursement; comprehensive resources to support your physical health and emotional well-being; family support programs; and Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about.

Our hybrid work model

BlackRock’s hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all. Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week. Some business groups may require more time in the office due to their roles and responsibilities. We remain focused on increasing the impactful moments that arise when we work together in person – aligned with our commitment to performance and innovation. As a new joiner, you can count on this hybrid model to accelerate your learning and onboarding experience here at BlackRock.

About BlackRock

At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being. Our clients, and the people they serve, are saving for retirement, paying for their children’s education, buying homes and starting businesses. Their investments also help to strengthen the global economy: they support businesses small and large, finance infrastructure projects that connect and power cities, and facilitate innovations that drive progress.

This mission would not be possible without our smartest investment – the one we make in our employees. It’s why we’re dedicated to creating an environment where our colleagues feel welcomed, valued and supported with networks, benefits and development opportunities to help them thrive.

For additional information on BlackRock, see Twitter: @blackrock | LinkedIn: www.linkedin.com/company/blackrock

BlackRock is proud to be an Equal Opportunity Employer. We evaluate qualified applicants without regard to age, disability, race, religion, sex, sexual orientation and other protected characteristics at law.

Top Skills

AWS
Azure DevOps
Go
Grafana
Java
Kafka
Python
Splunk

The Company
HQ: New York, NY
25,000 Employees
Year Founded: 1988

What We Do

As the world’s largest asset manager, BlackRock partners with investors around the globe to help them (and those on whose behalf they invest) plan for life’s most important goals – like retirement, home ownership and their children’s education. Our clients range from governments, foundations and other large institutions to those investing on behalf of individuals, including firefighters, nurses, teachers and factory workers.

BlackRock was founded with the idea of creating a better asset management firm — one that was purpose-driven, focused on clients and risk management, and propelled by data and technology. Our breakthrough Aladdin® platform is BlackRock’s technological backbone, helping investors see and manage their whole portfolios in one place – from constructing investments to monitoring risk and executing trades. Used by hundreds of external institutions around the world, Aladdin combines powerful analytics and a common language to help investment teams make faster, more informed decisions across public and private markets. It’s a key part of our business and one of the reasons we’re trusted to manage more assets than any other investment manager today.

At BlackRock, we challenge conventions and raise the bar for what’s possible. We harness technology to unlock new solutions, simplify complexity, and deliver investment strategies that meet people where they are. Whether it’s retirement planning, wealth building or navigating market shifts, we’re here to help clients invest more easily, more affordably and with more choice as we chart a path toward financial well-being together.

Learn more: Careers.BlackRock.com

Why Work With Us

Without our people, technology is irrelevant. When we combine the power of people with the power of technology, we amplify our ability to create better outcomes for our employees, clients, shareholders and society alike.

BlackRock Teams

Analytics & Risk
Client & Product
Corporate & Strategic
Investments
Operations
Technology
Early Careers

BlackRock Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

BlackRock has 25,000 employees across more than 100 offices in over 40 countries around the world.

Typical time on-site: 4 days a week
HQ: New York, NY
Mexico City
Singapore
Montreal
Toronto
Amsterdam
Atlanta
Bengaluru
Belgrade
Boston
Budapest
Hong Kong
Chicago
Frankfurt
Gurgaon
Mumbai
Munich
Paris
Princeton
Tokyo
Wilmington
Zurich
