Systems Engineer, Metrics and Alerting

2 Locations
Hybrid
Junior
Cloud • Information Technology • Security • Software • Cybersecurity
Helping Build a Better Internet
The Role
Design, deliver, and operate software for observability; solve scaling issues in Metrics & Alerting; participate in on-call rotation and mentorship.
Summary Generated by Built In
Available Locations: London or Lisbon
About the Department
Production Engineering is responsible for the world's most reliable, observable, performant, and safe network ecosystem. Our customers rely on our products and systems to safely modify, troubleshoot, and release products without external impact.
Our external customers rely on us to provide seamless and predictable incident, traffic, and policy management, resulting in the fastest and safest network services in the world.
We are accountable for the overall performance of internal- and external-facing services, guiding our product teams to optimal configurations and maximum efficiency. From the moment that a packet enters the Cloudflare ecosystem, we know exactly what its expected purpose and behaviour are, and we are capable of determining and exposing anomalous behaviour.
The Cloudflare network makes it possible to solve challenges at a scale and efficiency that would be impossible for almost any other organization.
About the Team
This role is for the internal Observability Team, which is responsible for the observability platform and stack that keep our engineering teams productive. This includes (but is not limited to) areas like metrics, alerting, error tracking, logging, and tracing.
In this role, you can expect to:
  • Design, deliver, and operate software and a platform that advance Cloudflare's Observability competency
  • Solve scaling bottlenecks in critical services in our Metrics & Alerting pipeline
  • Work on highly distributed and scalable systems
  • Participate in the constant cycle of knowledge sharing and mentoring
  • Participate in the global on-call rotation for the services your team owns
  • Research and introduce cutting-edge technologies
  • Contribute to open-source

We are a small, well-funded, growing team focused on building an extraordinary company. This is a software engineering/systems engineering role and a superb opportunity to join a high-performing team that supports Cloudflare's mission to help build a better Internet.
You may be a good fit for our team if you have:
  • A Software Engineering background and proficiency in high-level programming languages (e.g., Go)
  • Proficiency in data structures and databases such as TSDBs, columnar stores, or related systems
  • Proficiency in distributed Linux environments
  • Proficiency in designing high-scale distributed systems
  • Proficiency in Prometheus, Alertmanager, Thanos
  • Experience working in a fast, high-growth environment
  • Experience working in a 24/7/365 service environment
  • Exquisite written and verbal communication skills
  • Familiarity with internetworking, networking protocols across Layers 2-7 of the OSI model, and BGP
  • Strong bias for action

Bonus points if you have:
  • Experience with high-bandwidth transit Internetworking and routing
  • Passion for code simplicity and performance

Top Skills

Alertmanager
Go
Linux
Prometheus
Thanos

The Company
HQ: San Francisco, CA
4,400 Employees
Year Founded: 2010

What We Do

Cloudflare, Inc. (NYSE: NET) is the leading connectivity cloud company on a mission to help build a better Internet. It empowers organizations to make their employees, applications and networks faster and more secure everywhere, while reducing complexity and cost. Cloudflare’s connectivity cloud delivers the most full-featured, unified platform of cloud-native products and developer tools, so any organization can gain the control they need to work, develop, and accelerate their business.

Powered by one of the world’s largest and most interconnected networks, Cloudflare blocks billions of threats online for its customers every day. It is trusted by millions of organizations – from the largest brands to entrepreneurs and small businesses to nonprofits, humanitarian groups, and governments across the globe.

Why Work With Us

Cloudflare employees come from all walks of life. We are mission-driven, and our team is energized by a collaborative, creative environment that celebrates our differences and fosters new ways to grow together.

Cloudflare Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We are committed to developing a global team that is distributed with a flexible working approach. Doing this equitably and inclusively is essential to our success. Visit our careers site for more on 'How & Where We Work.'

Typical time on-site: Flexible
HQ: San Francisco, CA
Singapore
Austin, TX
Bengaluru, Karnataka
Boston, MA
Champaign, IL
Denver, CO
Lisbon, PT
London, GB
Los Angeles, CA
New York, NY
Seattle, WA
Washington, DC
