Senior Observability & Monitoring Engineer (SRE/DevOps)

Petah Tikva
In-Office
Senior level
Security • Software
The Role
Lead reliability and observability efforts for critical applications, optimize monitoring strategies, automate tasks, and mentor engineers in best practices.
Company Description

About CyberArk:
CyberArk (NASDAQ: CYBR) is the global leader in Identity Security. Centered on privileged access management, CyberArk provides the most comprehensive security offering for any identity – human or machine – across business applications, distributed workforces, hybrid cloud workloads, and throughout the DevOps lifecycle. The world’s leading organizations trust CyberArk to help secure their most critical assets. To learn more about CyberArk, visit our CyberArk blogs or follow us on X, LinkedIn, or Facebook.

Job Description

We are seeking an experienced Observability & Monitoring Engineer (SRE/DevOps) to lead reliability, observability, and performance efforts for our most critical applications. This role bridges development, operations, and product, ensuring our systems are robust, scalable, and drive superior business outcomes. The Senior Observability & Monitoring Engineer will design and optimize monitoring strategies, automate operational tasks, and serve as a technical mentor for reliability within the R&D organization.

Key Responsibilities:

  • Architect, implement, and maintain advanced monitoring, logging, and alerting solutions using Datadog (mandatory), covering infrastructure, application, and business-level metrics.
  • Lead and optimize reliability, performance, and scalability efforts for PostgreSQL, Redis, SQS, K8s, and cloud-native environments.
  • Design, build, and maintain automations for operational tasks, deployments, and remediations (Infrastructure-as-Code, CI/CD, self-healing workflows).
  • Mentor engineers on reliability engineering best practices, monitoring usage, and troubleshooting methodologies.
  • Lead knowledge sharing by producing high-quality documentation, technical presentations, and internal training.
  • Perform capacity planning, performance tuning, and proactively address potential bottlenecks or scaling issues.
  • Stay current with SRE, DevOps, and cloud trends; evaluate and recommend new tools and approaches for continuous improvement.

Qualifications

  • 7+ years of experience in SRE, DevOps, or production engineering roles supporting large-scale distributed systems.
  • Expertise architecting and operating monitoring, tracing, and alerting with Datadog (including custom metrics, dashboards, and advanced alerting techniques).
  • Experience with additional monitoring/observability platforms (e.g., Prometheus, Grafana, ELK stack).
  • Hands-on knowledge of PostgreSQL, Redis, SQS, and Kubernetes (deployment, troubleshooting, scaling, and performance optimization).
  • Advanced scripting/programming skills with Python, Bash, or another relevant language.
  • Track record of designing and implementing automated solutions (Infrastructure-as-Code, CI/CD pipelines, auto-remediation).
  • Strong communication skills, including technical writing, documentation, and presentation to diverse technical audiences.
  • Experience working closely with development, product, and architecture teams to embed reliability from the design phase.
  • Fluent technical English.

Preferred Qualifications:

  • Strong familiarity with SaaS, microservices architectures, and security best practices.
  • Cloud certifications (e.g., AWS Certified Solutions Architect, GCP Professional Cloud Engineer) are a plus.
  • Deep experience with chaos engineering, performance/load testing, and continuous improvement frameworks.
  • Demonstrated ability to mentor engineers, promote reliability culture, and foster knowledge sharing.

 

Top Skills

Bash
Datadog
ELK Stack
Grafana
Kubernetes
Postgres
Prometheus
Python
Redis
SQS
The Company
Hyderabad, Telangana
2,327 Employees

What We Do

CyberArk is the global leader in Identity Security. Centered on privileged access management, CyberArk provides the most comprehensive security offering for any identity – human or machine – across business applications, distributed workforces, hybrid cloud workloads and throughout the DevOps lifecycle. The world’s leading organizations trust CyberArk to help secure their most critical assets.

For over a decade CyberArk has led the market in securing enterprises against cyber attacks that take cover behind insider privileges and attack critical enterprise assets. Today, only CyberArk is delivering a new category of targeted security solutions that help leaders stop reacting to cyber threats and get ahead of them, preventing attack escalation before irreparable business harm is done. At a time when auditors and regulators are recognizing that privileged accounts are the fast track for cyber attacks and demanding stronger protection, CyberArk’s security solutions master high-stakes compliance and audit requirements while arming businesses to protect what matters most.

With offices and authorized partners worldwide, CyberArk is a vital security partner to more than 6,770 global businesses, including:

  • More than 50% of the Fortune 500
  • More than 35% of the Global 2000

CyberArk has offices in the U.S., Israel, the U.K., Singapore, Australia, France, Germany, Italy, Japan, the Netherlands, and Turkey.
