Site Reliability Engineer

Hiring Remotely in United Kingdom
Remote
Mid level
Cloud • Software • Analytics
The Role
The Site Reliability Engineer manages incident response, enhances service reliability, automates operations, and collaborates across teams to improve system design.
Summary Generated by Built In

At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

So, what’s the role all about?

The SRE‑NOC role sits at the intersection of traditional Network Operations Center (NOC) responsibilities and engineering‑driven reliability practices. This role focuses on 24/7 service reliability, incident response, operational automation, and observability, while actively reducing operational toil through software and automation.

Unlike a traditional NOC analyst, an SRE‑NOC is expected to engineer problems away, not just respond to alerts.

How will you make an impact?

Incident Response & Operations

  • Act as a primary or escalation responder in a 24x7 on‑call rotation
  • Lead or support Major Incident (MI) response, including triage, mitigation, and resolution
  • Coordinate across Engineering, Infrastructure, Security, and Product teams
  • Execute and improve runbooks, playbooks, and escalation paths
  • Drive blameless post‑incident reviews (PIRs) and track corrective actions

Monitoring, Alerting & Observability

  • Own service health monitoring across infrastructure, applications, and dependencies
  • Design and maintain alerting strategies that align with SLIs/SLOs
  • Reduce alert fatigue through signal‑to‑noise improvements
  • Build dashboards using tools such as:
    • Grafana
    • Prometheus
    • Datadog / Splunk / CloudWatch

Reliability Engineering & Automation

  • Automate repetitive operational tasks to reduce manual toil
  • Improve mean time to detect (MTTD) and mean time to resolve (MTTR)
  • Develop scripts and tools (Python, Bash, Go, etc.) to support NOC/SRE workflows
  • Implement self‑healing and auto‑remediation where possible
  • Partner with engineering teams to improve system design for reliability

Platform & Infrastructure Support

  • Support and troubleshoot:
    • Linux‑based systems
    • Cloud platforms (AWS, Azure, GCP)
    • Kubernetes / containerized environments
  • Assist with capacity planning and availability reviews
  • Ensure operational readiness for production releases

Have you got what it takes?

Technical

  • Strong Linux systems administration
  • Experience with incident management and production support
  • Familiarity with:
    • Cloud infrastructure (AWS preferred)
    • Containers & orchestration (Docker, Kubernetes)
    • Monitoring/alerting platforms
  • Scripting or programming experience in Python, Bash, Go, or similar
  • Understanding of networking fundamentals (DNS, TCP/IP, load balancing)

Operational

  • Experience working in 24x7 NOC or production operations environments
  • Ability to handle high‑pressure incidents calmly and effectively
  • Strong written and verbal communication for incident coordination
  • Comfort working from runbooks, and improving them when they fall short

Preferred / Differentiators

  • Experience defining or operating to SLOs / SLIs
  • Prior experience migrating from a traditional NOC to an SRE model
  • Infrastructure as Code experience (Terraform, Ansible, etc.)
  • Exposure to security, compliance, or regulated environments

Requisition ID: 10579.

Reporting into: Manager, Network Operations.

Role Type: Individual Contributor.

#LI-Remote

About NiCE

NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NiCE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions.

Known as an innovation powerhouse that excels in AI, cloud and digital, NiCE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.

NiCE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.


NICE Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about NICE and has not been reviewed or approved by NICE.

  • Healthcare Strength: Benefits are described as broad and comprehensive, spanning medical, dental, vision, life, disability, and mental-health support. Added programs like FSA options and fitness stipends contribute to a well-rounded health and wellness offering.
  • Retirement Support: A 401(k) is part of the package, sometimes paired with match details that are described as typical to stronger depending on role and time period. Employee stock participation is also positioned as an additional long-term wealth-building component for eligible roles.
  • Flexible Benefits: Flexible work arrangements are emphasized, including hybrid setups and remote options for some roles. Flex scheduling, paid holidays, and paid sick time add to the perceived flexibility of the overall rewards package.

The Company
HQ: Hoboken, NJ
10,130 Employees
Year Founded: 1986

What We Do

NICE (Nasdaq: NICE) is the worldwide leading provider of both cloud and on-premises enterprise software solutions that empower organizations to make smarter decisions based on advanced analytics of structured and unstructured data. NICE helps organizations of all sizes deliver better customer service, ensure compliance, combat fraud and safeguard citizens. Over 25,000 organizations in more than 150 countries, including over 85 of the Fortune 100 companies, are using NICE solutions. www.nice.com.
