Principal Site Reliability Engineer, Network Observability

Sorry, this job was removed at 08:19 p.m. (CST) on Friday, Jul 11, 2025
Hiring Remotely in United States
Remote or Hybrid
115K-164K Annually
Cloud • Enterprise Web • Information Technology • Other
We Connect What's Next
The Role

Company Description

Zayo provides mission-critical bandwidth to the world’s most impactful companies, fueling the innovations that are transforming our society. Zayo’s 141,000-mile network in North America and Europe includes extensive metro connectivity to thousands of buildings and data centers. Zayo’s communications infrastructure solutions include dark fiber, private data networks, wavelengths, Ethernet, and dedicated Internet access. Zayo serves wireless and wireline carriers, media, tech, content, finance, healthcare and other large enterprises.

Do you dream in highly scalable systems, thrive in fast-paced environments, and enjoy tackling complex technical challenges? Are you passionate about diving into the details and building the most accurate and durable network observability systems? If so, then join our team as a Principal Site Reliability Engineer, Network Observability!

We're looking for a talented Principal Site Reliability Engineer, Network Observability to play a critical role in ensuring the uptime, performance, and scalability of our network with a focus on our network observability systems.

Responsibilities:

  • Automation: Work with the NOC and software engineering teams to discover processes around network observability that can be automated, and then create a technical plan to implement both the technical and process changes.

  • Monitoring and Alerting: Work with the network observability team to design and implement effective monitoring and alerting to proactively identify and address issues.

  • Incident Management: Own the incident lifecycle, from leading root cause analysis and resolution to implementing preventative measures to avoid future occurrences. Focus on chronic and big-picture issues that may have complex resolutions spanning departments, processes, and technical elements.

  • Reliability Engineering: Proactively identify and mitigate potential system risks, focusing on automation, monitoring, and tooling to ensure high service availability.

  • Scalability and Performance: Design and implement solutions to ensure our infrastructure can handle ever-growing demands while maintaining optimal application performance and providing the best possible detail on service degradation and outages to the NOC. Have a laser focus on reducing the mean time it takes for the NOC to correctly diagnose issues, and on automating troubleshooting and information collection.

  • Collaboration: Work closely with developers, product managers, and engineers to translate business needs into robust and reliable technical solutions. Become the beacon for best practices and efficient processes throughout the organization.

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).

  • Minimum of twelve (12) years of experience in a Senior Network Engineer, Senior Site Reliability Engineer, or related role.

  • Strong understanding of system administration and Linux, and proficiency in scripting languages (Python and various shells).

  • Previous experience working both in a NOC and in an upper-level network engineering role.

  • Exceptionally strong working knowledge of networking concepts, application protocols, and network services, especially TCP/IP, BGP, DNS, TLS, and HTTP/S.

  • Expert at developing automation tools for monitoring, alerting, and deployment to ensure efficient and reliable operations.

  • Expert at designing and implementing monitoring systems at scale.

  • Experience with various monitoring platforms such as SevOne, Assure1, Prometheus, and Nagios, as well as various vendor EMS/NMS systems.

  • Previous work in large-scale distributed production environments.

  • Experience with a variety of cloud platforms and tools (AWS, Google, etc.)

  • Experience with a variety of monitoring and alerting tools (Grafana, Cacti, etc.)

  • Proven leadership skills, with the ability to mentor and inspire others.

  • Excellent problem-solving, analytical, and critical thinking skills.

  • A passion for automation and building efficient systems.

  • Expert experience working in a highly automated environment.

Preferred Experience:

  • Experience working with various vendor APIs (or NETCONF) including Nokia, Juniper, Fujitsu, Infinera, Cisco, and Ciena.

  • Experience with various network orchestration platforms such as Ciena Blue Planet MDSO, Cisco NSO, Nokia NSP, or others.

  • Experience automating network troubleshooting.

Estimated Base Salary Range: $114,900 - $164,200 USD/annually.

The base pay range shown is a guideline and reasonable estimate for this role. It takes into account the wide variety of factors that are considered in making compensation decisions. Actual compensation offered may vary from the posted range based upon geographic location, work experience, skill level, certifications, and other business and organizational needs. Non-sales roles may be eligible to participate in a discretionary annual incentive plan. Sales roles may be eligible to participate in a sales incentive plan.

Additionally, this position may be eligible for certain benefits, such as health insurance, life insurance, disability, retirement plans, and paid time off.

The posting will be active for a minimum of 3 days. The active posting will continue to extend by 3 days until the position is filled.

Benefits, Rewards & Wellness

  • Excellent Health, Dental & Vision Insurance

  • Retirement 401(k) Savings Plan

  • Generous paid time off policy including paid parental leave

Zayo provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, provincial or local laws.

This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.

The Company
HQ: Boulder, CO
4,000 Employees
Year Founded: 2007

What We Do

Zayo Group Holdings, Inc. is a leading global communications infrastructure platform, delivering a range of solutions, including fiber & transport, packet and managed edge services. Zayo owns and operates a Tier 1 IP backbone spanning 134,000 miles across North America and Europe. By providing this mission-critical bandwidth to its category-leading customers across the wireless, hyperscale, media, tech and finance industries, Zayo is fueling the innovations that are transforming society. For more information, visit https://zayo.com.

Why Work With Us

We are ambitious and collaborative. Our culture is centered on excellence and exceeding customer expectations through high performance, big ideas, and a growth mindset.
