Senior Site Reliability Engineer, Observability

Reposted 2 Hours Ago
9 Locations
In-Office or Remote
Senior level
Blockchain • Internet of Things • Payments • Cryptocurrency • Web3
Chainlink Labs is where the bold belong.
The Role
As a Senior Site Reliability Engineer, you'll build observability platforms, support telemetry types, ensure reliability and security, and collaborate with engineers to deploy services.
Summary Generated by Built In

About Us 

Chainlink Labs is one of the primary contributing developers of Chainlink, the industry-standard oracle platform bringing the capital markets onchain and powering the majority of decentralized finance. The Chainlink stack provides the essential data, interoperability, compliance, and privacy standards needed to power advanced blockchain use cases for institutional tokenized assets, Decentralized Finance (DeFi), payments, stablecoins, and more. Many of the world’s largest financial services institutions have also adopted Chainlink’s standards and infrastructure, including Swift, Euroclear, Mastercard, Fidelity International, UBS, ANZ, Aave, GMX, Lido, and many others.

Chainlink Labs is a world-class team of over 600 developers, researchers, and capital markets experts, and has ranked among Fortune's Best Workplaces in Technology, Fortune's Best Medium Workplace, and the Top 100 Global Most Loved Workplaces. Learn more at chain.link or chainlinklabs.com.

The Observability Team enables Chainlink development and empowers engineers to continue building and supporting crucial products and services that have a profound impact in the blockchain industry. Reliability is vital to the success of our company. As a Senior SRE, you will help us accelerate and enable other engineering teams by increasing self-service and decreasing cognitive load.

This job would be perfect for someone who has a strong DevOps mentality, is passionate about building and maintaining a mature GitOps environment, and has experience focusing on observability. The entire engineering team is expanding, and you would have plenty of opportunities to build, learn, and grow.

We all have different backgrounds and are determined to help you succeed no matter where you are or who you are. If you think you would do a great job at Chainlink, we are looking forward to speaking with you, even if you don't match 100% of the job requirements: those describe people we've usually had a great time working with, but they're not a tick-box exercise.

Your Impact

  • Build and orchestrate a modern OpenTelemetry (OTel)-based observability platform.

  • Support multiple telemetry types, such as metrics, logs, and traces.

  • Define and support modern observability governance, and address observability problems at scale.

  • Ensure that reliability, security, and performance exceed our defined SLAs.

  • Work with engineers from across the company to help troubleshoot issues, deploy new products and services, and increase velocity while decreasing cognitive load

  • Lead the design and deployment of monitoring/observability services to detect and alert the team of needed action.

  • Ingest, aggregate, transform, and utilize data from a multitude of sources in our real time data pipeline.

  • Oversee the availability, performance, and supportability of our observability infrastructure.

  • Create processes around alert response operations and support the team to ensure the reliable delivery of oracle data.

  • Make recommendations to ensure sufficient metrics are collected to create alerts with every new feature release.

  • Champion reliability and security by taking the time to do your work right the first time

Requirements

  • 7+ years of relevant professional experience. You have probably worked on a DevOps, infrastructure, SRE, and/or platform team before.

  • Ability to develop software outside of the scope of typical infrastructure requirements and configurations

  • Experience programming in C, C++, Java, Python, Go, Perl, or Ruby

  • Expert knowledge in all aspects of designing, developing, and managing large real-time systems

  • Experience with monitoring and logging. You know how to export metrics using Prometheus, have built a Grafana dashboard or two, and have experience with a centralized logging solution such as the ELK Stack, Splunk, or the Grafana stack.

  • Experience with distributed systems and container orchestration. You have maintained or even built Kubernetes clusters before and feel comfortable deploying completely new services on them

  • Strong communication skills. You can give and receive constructive feedback, and you do not shy away from planning meetings and code reviews

Desired Qualifications

  • Excitement for blockchain, Web 3.0, and similar decentralized technologies.

  • Experience running any infrastructure in the blockchain/web3 space

  • Ability to scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity

  • Experience working remotely in a distributed team

  • A strong desire to grow and challenge yourself. We would expect you to constantly find ways to improve and automate services to reduce toil

Some of the tools and services we use daily or almost daily are:

  • AWS; Terraform/Terragrunt; Kubernetes, Calico and ArgoCD; Prometheus and Grafana; GitHub Actions; Packer

  • We expect you to be comfortable with most of those tools and very proficient in several of them.

All roles with Chainlink Labs are global and remote-based. Unless otherwise stated, we ask that you try to overlap some working hours with Eastern Standard Time (EST).

We carefully review all applications and aim to provide a response to every candidate within two weeks after the job posting closes. The closing date is listed on the job advert, so we encourage you to take the time to thoughtfully prepare your application. We want to fully consider your experience and skills, and you will hear from us regarding the status of your application shortly after the closing date.

Commitment to Equal Opportunity

Chainlink Labs is an equal opportunity employer. All qualified applicants will receive equal consideration for employment in compliance with applicable laws, regulations, or ordinances. If you need assistance or accommodation due to a disability or special need when applying for a role or in our recruitment process, please contact us via this form.

Global Data Privacy Notice for Job Candidates and Applicants

Information collected and processed as part of your Chainlink Labs Careers profile and any job applications you choose to submit is subject to our Privacy Policy. By submitting your application, you are agreeing to our use and processing of your data as required.

Top Skills

AWS
C
C++
ELK Stack
GitHub Actions
Go
Grafana
Java
Kubernetes
Packer
Perl
Prometheus
Python
Ruby
Splunk
Terraform

The Company
680 Employees
Year Founded: 2017

What We Do

Chainlink Labs is one of the primary contributing developers of Chainlink, the backbone of blockchain. Chainlink is unifying liquidity across global markets and has enabled over $20 trillion in transaction value across the blockchain economy. Major financial market infrastructures and institutions, such as Swift, Fidelity International, and ANZ Bank, as well as top DeFi protocols including Aave, GMX, and Lido, use Chainlink to power next-generation applications for banking, asset management, and other major sectors.

Why Work With Us

Chainlink Labs is designed for the bold—those who aren’t afraid to challenge the status quo and drive innovation. Here, you're not just stepping into your next role; you're becoming part of a seismic shift in how the world operates.
