Senior Site Reliability Engineer

Hyderabad, Telangana
In-Office
Senior level
Security • Software
The Role
As a Senior Site Reliability Engineer, you will ensure the reliability and performance of cloud infrastructure, manage incidents, automate processes, and collaborate cross-functionally.
Company Description

About CyberArk:
CyberArk (NASDAQ: CYBR) is the global leader in Identity Security. Centered on privileged access management, CyberArk provides the most comprehensive security offering for any identity – human or machine – across business applications, distributed workforces, hybrid cloud workloads, and throughout the DevOps lifecycle. The world’s leading organizations trust CyberArk to help secure their most critical assets. To learn more about CyberArk, visit our CyberArk blogs or follow us on X, LinkedIn, or Facebook.

Job Description

We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a pivotal role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure. You will collaborate closely with development, operations, and other teams to implement and maintain efficient and resilient systems.

We are the SRE Frontline Team of CyberArk. Our group uses monitoring tools and dashboards to keep the health and performance of our systems and services optimal. Our goal is to maintain a scalable, fault-tolerant, high-load, distributed system. We are searching for an outstanding SRE expert to drive and improve the Incident Management processes and goals for the Site Reliability teams, with a focus on triaging incidents and ensuring the reliability, performance, and scalability of CyberArk’s SaaS services and underlying AWS infrastructure. This role combines technical expertise, documentation, and collaboration to meet the organization's reliability and availability goals.

Responsibilities:

  • Incident Management, Monitoring and Alerting: Drive incident response processes and troubleshoot complex issues, ensuring timely resolution of outages. Establish monitoring, logging, and alerting best practices using tools such as Datadog and Site24x7.
  • Reliability & Availability: Ensure high availability and performance of production systems (SLIs, SLOs, SLAs). Design and implement strategies to reduce system downtime and improve resiliency.

  • Tooling and Automation: Build essential tooling to improve system reliability and automate the remediation of issues.
  • On-Call: Be part of the 24x7x365 on-call rotation.
  • SOP Documentation: Create and maintain documentation for infrastructure, processes, and incident management protocols.
  • Understanding of Infrastructure as Code (IaC) tools such as Terraform and Ansible to automate the provisioning, configuration, and deployment processes.
  • Cloud Platform Expertise: Hands-on experience with AWS services such as EC2, S3, VPC, RDS, EKS, ECS, CloudWatch, CloudFormation, and more. AWS certification is a plus.
  • CI/CD Pipelines: Fair understanding of CI/CD pipelines using tools like Jenkins.
  • Monitoring and Alerting: Hands-on experience with monitoring and alerting tools such as Site24x7, Datadog, CloudWatch, and Grafana to proactively identify and resolve issues.
  • Performance Tuning: Continuously optimize system performance, identify bottlenecks, and implement strategies to improve scalability and efficiency.
  • Cost Optimization: Identify and implement strategies to reduce cloud costs while maintaining performance and reliability.
  • Security Best Practices: Adhere to security best practices and implement measures to protect infrastructure and data from vulnerabilities and threats.
  • Collaboration and Leadership: Work effectively with cross-functional teams to understand business requirements and provide technical guidance. Partner with developers, product managers, and security teams to design reliable services. Advocate SRE best practices and drive cultural adoption of reliability engineering. Mentor junior engineers and contribute to team growth.

    Qualifications

    Required Skills and Experience:

     

    • 5-8 years in SRE, DevOps, or cloud infrastructure roles.
    • Strong expertise in cloud platforms (AWS, GCP, or Azure).
    • Deep understanding and hands-on experience of AWS cloud services like EC2, S3, VPC, RDS, EKS, ECS, CloudFormation and more. AWS Certification is a plus.
    • Good logical, analytical, and problem-solving skills.
    • Strong communication skills and the ability to work in shifts (24x7).
    • Strong scripting skills (Python, PowerShell, CDK, Shell scripting).
    • Understanding of infrastructure as code tools (Terraform, Ansible) and AWX Tower.
    • Knowledge of containerization (Docker) and orchestration platforms (Kubernetes).
    • Expertise in CI/CD pipelines and automation tools (Jenkins, GitHub).
    • Exposure to monitoring and observability tools (CloudWatch, Datadog, ELK, Grafana, Site24x7).
    • Experience documenting SOPs and RCAs.
    • Understanding of security best practices and compliance standards. Security Certification is a plus.

    Top Skills

    Ansible
    AWS
    Bash
    CloudWatch
    Datadog
    Docker
    ELK
    Grafana
    Jenkins
    Kubernetes
    PowerShell
    Python
    Site24x7
    Terraform
    The Company
    Hyderabad, Telangana
    2,327 Employees

    What We Do

    For over a decade CyberArk has led the market in securing enterprises against cyber attacks that take cover behind insider privileges and attack critical enterprise assets. Today, only CyberArk is delivering a new category of targeted security solutions that help leaders stop reacting to cyber threats and get ahead of them, preventing attack escalation before irreparable business harm is done. At a time when auditors and regulators are recognizing that privileged accounts are the fast track for cyber attacks and demanding stronger protection, CyberArk’s security solutions master high-stakes compliance and audit requirements while arming businesses to protect what matters most.

    With offices and authorized partners worldwide, CyberArk is a vital security partner to more than 6,770 global businesses, including:

    More than 50% of the Fortune 500
    More than 35% of the Global 2000

    CyberArk has offices in the U.S., Israel, the U.K., Singapore, Australia, France, Germany, Italy, Japan, the Netherlands, and Turkey.
