Systems Reliability Engineer (SRE), Edge Platform

Posted 12 Days Ago
Hiring Remotely in Austin, TX
Remote
Hybrid
Mid level
Cloud • Information Technology • Security • Software • Cybersecurity
Helping Build a Better Internet
The Role
As a Systems Reliability Engineer, you'll build and maintain the Edge platform across a global network, focusing on automation, scalability, and operational excellence. Your responsibilities include managing service availability, developing tools for performance improvement, and leveraging monitoring tools while enhancing platform capabilities. You'll utilize your coding skills in Go or Python, alongside your knowledge of Linux and networking protocols.
Summary Generated by Built In

About the Role
We are looking for talented Systems Reliability Engineers to build and operate our Edge platform running in more than 320 cities in over 120 countries. Our SREs come from diverse technical backgrounds and have built up their knowledge working in different environments, but common factors across all of our reliability-focused engineers include a passion for automation, scalability, and operational excellence. We support our services in a "follow the sun" model with offices in East Asia, Europe and North America.
This is a superb opportunity to join a high-performing team and scale our high-growth network as Cloudflare's business grows. We live at the boundary between systems, network, and software, and love improving the glue that holds them together. Working with us, you will build tools to constantly improve service availability, performance, and operational velocity. You will nurture a passion for an "automate everything" approach that makes systems failure-resistant and ready to scale.
SREs focus on the immediate state and functionality of the Cloudflare platform around the world, leveraging an array of monitoring, alerting and diagnostics tools while developing and enhancing the Cloudflare platform and its capabilities. We own a wide portfolio of applications and services, running a tight feedback loop of developer and operator patterns. The ideal SRE candidate has a passionate curiosity about how the Internet fundamentally works and has a strong knowledge of networking, Linux and TLS along with coding ability in Go or Python.
Requisite Skills

  • Aptitude for identifying problems, owning them and working with others to solve them
  • Linux systems experience
  • 3 years of experience in an SRE role or a role with similar functions
  • Software development skills in a programming language such as Go or Python
  • Understanding of distributed software systems and large-scale system design tradeoffs
  • Intermediate knowledge of common network protocols like DNS and HTTP


Examples of desirable skills, knowledge and experience

  • Experience with the Linux kernel and Linux software packaging
  • Performance analysis and debugging
  • Configuration management systems such as Saltstack, Chef, Puppet or Ansible
  • Load balancers and reverse proxies such as Nginx, Varnish, HAProxy, Squid or Apache
  • SQL databases
  • Time series databases and monitoring tools such as OpenTSDB, Graphite, Prometheus or Grafana
  • Key/Value stores
  • Internetworking and BGP


Bonus Points

  • Experience with continuous / rapid release engineering
  • Strong tooling and automation development experience
  • Experience working in a 24/7/365 service environment
  • Experience working with large-scale distributed production systems
  • A history of contributing to Open Source Software


Some tools that we use

  • Nginx
  • PostgreSQL
  • Docker
  • Prometheus
  • Grafana
  • Consul
  • Nomad
  • Temporal
  • Salt

Top Skills

Go
Python
The Company
HQ: San Francisco, CA
3,900 Employees
Hybrid Workplace
Year Founded: 2010

What We Do

Cloudflare, Inc. (NYSE: NET) is the leading connectivity cloud company on a mission to help build a better Internet. It empowers organizations to make their employees, applications and networks faster and more secure everywhere, while reducing complexity and cost. Cloudflare’s connectivity cloud delivers the most full-featured, unified platform of cloud-native products and developer tools, so any organization can gain the control they need to work, develop, and accelerate their business.

Powered by one of the world’s largest and most interconnected networks, Cloudflare blocks billions of threats online for its customers every day. It is trusted by millions of organizations – from the largest brands to entrepreneurs and small businesses to nonprofits, humanitarian groups, and governments across the globe.

Why Work With Us

Cloudflare employees come from all walks of life. We are mission-driven, and our team is energized by a collaborative, creative environment that celebrates our differences and fosters new ways to grow together.

Cloudflare Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We are committed to developing a global team that is distributed with a flexible working approach. Doing this equitably and inclusively is essential to our success. Visit our careers site for more on 'How & Where We Work.'

Typical time on-site: Flexible
HQ: San Francisco, CA
Singapore
Austin, TX
Champaign, IL
Lisbon, PT
London, GB
New York, NY
Seattle, WA
Washington, DC