Systems Reliability Engineer, Hardware Platforms (Networking Focus)

Remote
Sorry, this job was removed at 11:53 a.m. (CST) on Tuesday, August 30, 2022

About Us
At Cloudflare, we have our eyes set on an ambitious goal: to help build a better Internet. Today the company runs one of the world's largest networks that powers approximately 25 million Internet properties, for customers ranging from individual bloggers to SMBs to Fortune 500 companies. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare all have web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine's Top Company Cultures list and ranked among the World's Most Innovative Companies by Fast Company.
We realize people do not fit into neat boxes. We are looking for curious and empathetic individuals who are committed to developing themselves and learning new skills, and we are ready to help you do that. We cannot complete our mission without building a diverse and inclusive team. We hire the best people based on an evaluation of their potential and support them throughout their time at Cloudflare. Come join us!
About the Team
The Infrastructure Engineering organization is responsible for building software to automate our large fleet of physical hardware, data centers, and network equipment, and serves as the interface layer between our platform and services SRE teams and our physical infrastructure teams. The Core Metal Team focuses on our core data centers, which are some of our largest, most complex, and most mission-critical environments. We have a broad set of responsibilities and are highly collaborative. We work closely with the Core SRE, Edge SRE, Network, Capacity Planning, and platform teams on operational issues, assist the Hardware Engineering team in creating and validating new hardware specifications, and collaborate with the physical infrastructure teams to plan new sites, expansions, and physical maintenance.
About the Role
An engineering role at Cloudflare provides an opportunity to address some big challenges, at scale. We believe that with our talented team, we can solve some of the biggest security, reliability and performance problems facing the Internet. Just how big?

  • We have in excess of 100 Tbps of network transit capacity
  • We operate data centers in more than 250 cities in over 100 countries
  • We serve 28 million HTTP requests per second on average, with more than 35 million HTTP requests per second at peak
  • We interconnect with over 10,000 networks globally, including major ISPs, cloud services, and enterprises
  • Anytime we push code, it affects hundreds of millions of Internet users
  • More than 1 billion unique IP addresses pass through Cloudflare's network every day


We are looking for talented Systems Reliability Engineers to build and operate the platform that earns our customers' trust. Our SREs come from a variety of technical backgrounds and have built up their knowledge working in different environments, but all of our reliability-focused engineers share a passion for automation, scalability, and operational excellence. This is a superb opportunity to join a high-performing team and scale our high-growth network as Cloudflare's business grows. You will build tools to constantly improve availability, performance, uptime and response times. You will nurture a passion for an "automate everything" approach that makes systems failure-resistant and ready to scale.
Cloudflare SREs focus on either the Core network or the Edge network. This role is focused on the Core network and is responsible for hardware and data center infrastructure automation and management, building the layer between the physical infrastructure and the services that Engineering and other SREs use on a day-to-day basis. Our work ranges from low-level projects (e.g. BIOS, kernel, BMC) to full software development projects (APIs for hardware resource management, provisioning automation, etc.) that help manage and scale our infrastructure. This specific role has a focus on networking and related automation.
Examples of desirable skills, knowledge and experience

  • Comprehensive understanding of Linux networking, including systemd-networkd, iproute2, BGP, ECMP, routing, iptables/firewalls, IPv6, ARP, DNS, TLS/SSL, HTTP
  • Prior relevant Site Reliability Engineering, Linux systems administration, Network Engineering, and/or DevOps experience
  • Proficient in one or more programming languages and willing to learn new ones when required (Python, Rust, and Go are the primary languages we use)
  • Experience with the Linux kernel and Linux software packaging
  • Configuration management systems such as SaltStack, Chef, Puppet or Ansible
  • Familiarity with network operating systems such as Cisco IOS, Cisco NX-OS, Juniper JunOS, or Arista EOS
  • Load balancing and reverse proxies such as Nginx, Varnish, HAProxy, Apache
  • SQL databases (Postgres or MySQL)
  • Good understanding of software development fundamentals (e.g. OOP, design patterns)


Bonus Points

  • Experience with continuous / rapid deployment
  • Experience working in a 24/7/365 mission-critical service environment
  • Time series databases and monitoring tools (Prometheus, Graphite, Grafana)
  • Experience building Linux networking software
  • Experience with eBPF
  • Performance analysis and debugging with tools like perf, sar, strace, dtrace
  • Experience working in between the hardware and software interfaces
  • Experience automating bare metal hardware at scale (provisioning, diagnosis and remediation, firmware, observability, etc.)
  • Experience working with IPMI and/or Redfish
  • Familiarity with data center infrastructure (power, cooling, fiber, DCIM)


Some tools that we use

  • Python, Rust, Go
  • Salt
  • Nginx
  • PostgreSQL
  • Redis
  • Docker
  • Prometheus
  • Kubernetes
  • Consul


Compensation
Compensation may be adjusted depending on work location. For Colorado-based hires: Estimated annual salary of $168,000 - $206,000.
Equity
This role is eligible to participate in Cloudflare's equity plan.
Benefits
Cloudflare offers a complete package of benefits and programs to support you and your family. Our benefits programs can help you pay health care expenses, support caregiving, build capital for the future and make life a little easier and more fun! Below is a description of our benefits for employees in the United States; benefits may vary for employees based outside the U.S.
Health & Welfare Benefits

  • Medical/Rx Insurance
  • Dental Insurance
  • Vision Insurance
  • Flexible Spending Accounts
  • Commuter Spending Accounts
  • Fertility & Family Forming Benefits
  • On-demand mental health support and Employee Assistance Program
  • Global Travel Medical Insurance


Financial Benefits

  • Short and Long Term Disability Insurance
  • Life & Accident Insurance
  • 401(k) Retirement Savings Plan
  • Employee Stock Participation Plan


Time Off

  • Flexible paid time off covering vacation and sick leave
  • Leave programs, including parental, pregnancy health, medical, and bereavement leave


What Makes Cloudflare Special?
We're not just a highly ambitious, large-scale technology company. We're a highly ambitious, large-scale technology company with a soul. Fundamental to our mission to help build a better Internet is protecting the free and open Internet.
Project Galileo: We equip politically and artistically important organizations and journalists with powerful tools to defend themselves against attacks that would otherwise censor their work. This is technology already used by Cloudflare's enterprise customers, provided at no cost.
Athenian Project: We created the Athenian Project to ensure that state and local governments have the highest level of protection and reliability for free, so that their constituents have access to election information and voter registration.
Path Forward Partnership: Since 2016, we have partnered with Path Forward, a nonprofit organization, to create 16-week positions for mid-career professionals who want to get back to the workplace after taking time off to care for a child, parent, or loved one.
1.1.1.1: We released 1.1.1.1 to help fix the foundation of the Internet by building a faster, more secure and privacy-centric public DNS resolver. It is publicly available for everyone to use, and it is the first consumer-focused service Cloudflare has ever released. Here's the deal: we never, ever store client IP addresses. We will continue to abide by our privacy commitment and ensure that no user data is sold to advertisers or used to target consumers.
Sound like something you'd like to be a part of? We'd love to hear from you!
This position may require access to information protected under U.S. export control laws, including the U.S. Export Administration Regulations. Please note that any offer of employment may be conditioned on your authorization to receive software or technology controlled under these U.S. export laws without sponsorship for an export license.
Cloudflare is proud to be an equal opportunity employer. We are committed to providing equal employment opportunity for all people and place great value in both diversity and inclusiveness. All qualified applicants will be considered for employment without regard to their, or any other person's, perceived or actual race, color, religion, sex, gender, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship, age, physical or mental disability, medical condition, family care status, or any other basis protected by law. We are an AA/Veterans/Disabled Employer.
Cloudflare provides reasonable accommodations to qualified individuals with disabilities. Please tell us if you require a reasonable accommodation to apply for a job. Examples of reasonable accommodations include, but are not limited to, changing the application process, providing documents in an alternate format, using a sign language interpreter, or using specialized equipment. If you require a reasonable accommodation to apply for a job, please contact us via e-mail at [email protected] or via mail at 101 Townsend St., San Francisco, CA 94107.


Technology we use

  • Languages: C++, Golang, JavaScript, Python, R, SQL, Rust
  • Libraries: React
  • Management: Confluence, JIRA, Smartsheet
  • CRM: Salesforce
  • Collaboration: Google Hangouts, Google Meet

What are Cloudflare Perks + Benefits

Culture

  • Volunteer in local community
  • Open door policy
  • OKR operational model
  • Team based strategic planning
  • Pair programming
  • Open office floor plan
  • Employee resource groups
  • Hybrid work model
  • Employee awards
  • Flexible work schedule
  • Remote work program


Diversity

  • Dedicated diversity and inclusion staff
  • Mandated unconscious bias training
  • Diversity employee resource groups: Cloudflare has many ERGs including Afroflare, Latinflare, Nativeflare, Asianflare, Desiflare, Womenflare, Women in Engineering, Proudflare, Vetflare, and more.
  • Hiring practices that promote diversity


Health Insurance & Wellness Benefits

  • Flexible Spending Account (FSA)
  • Disability insurance
  • Dental insurance
  • Vision insurance
  • Health insurance
  • Life insurance
  • Pet insurance
  • Wellness programs
  • Team workouts


Financial & Retirement

  • 401(K)
  • Company equity
  • Employee stock purchase plan


Child Care & Parental Leave Benefits

  • Childcare benefits
  • Generous parental leave
  • Family medical leave
  • Return-to-work program post parental leave


Vacation & Time Off Benefits

  • Generous PTO: We encourage employees to find a comfortable work-life balance by taking as many days off as they need while still being able to perform their jobs satisfactorily. (We really mean it!)
  • Paid volunteer time: As part of Pledge 1%, Cloudflare has committed to donate 1% of our team's time to volunteering. To meet that goal, we offer all employees 3 days additional annual leave to volunteer in their community.
  • Sabbatical
  • Paid holidays
  • Paid sick days
  • Flexible time off


Office Perks

  • Commuter benefits
  • Company-sponsored outings
  • Free snacks and drinks
  • Some meals provided
  • Company-sponsored happy hours
  • Relocation assistance
  • Fitness stipend


Professional Development Benefits

  • Job training & conferences
  • Lunch and learns
  • Promote from within
  • Online course subscriptions available
