Site Reliability Engineer

2 Locations
Hybrid
Mid level
Cloud • Computer Vision • Information Technology • Sales • Security • Cybersecurity
Define your future at CrowdStrike.
The Role
As a Site Reliability Engineer, you will ensure the reliability, performance, and scalability of the NG-SIEM platform by implementing automation and optimization strategies, leading incident management, and collaborating with engineering teams.

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. We work on large-scale distributed systems that process almost 3 trillion events per day, and this traffic is growing daily. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

Our mission is to make all of our customers' security-relevant data continuously available for automated detection and response, threat hunting, and other Falcon use cases. To enable this, the systems behind NG-SIEM are growing to accommodate more than 100 PB of event and action data ingested every day, up to 10 years of retention, and tens of millions of queries per hour across large sections of the stored data.

As our new NG-SIEM Site Reliability Engineer, you'll be responsible for ensuring the reliability, performance, and scalability of our serverless platform that delivers this massive scale to customers and other Falcon modules. You'll work on improving system observability, automating operational tasks, optimizing resource utilization, and maintaining our stringent SLOs while balancing cost efficiency. This role requires deep technical expertise in distributed systems and cloud infrastructure, along with a passion for operational excellence.

What You'll Do:

  • Ensure Platform Reliability: Own the availability, latency, performance, and efficiency of NG-SIEM platform services handling >100 PB/day of data ingestion and millions of queries per hour

  • Build Automation & Tooling: Design and implement automation solutions for deployment, monitoring, incident response, and capacity planning to reduce toil and improve operational efficiency

  • Monitor & Optimize: Develop comprehensive observability solutions using metrics, logs, and traces; proactively identify and resolve performance bottlenecks and reliability issues

  • Incident Management: Lead incident response efforts, conduct blameless post-mortems, and drive continuous improvement initiatives to prevent recurrence

  • Capacity Planning: Analyze system performance data and growth trends to forecast infrastructure needs and ensure the platform scales efficiently with customer demand

  • SLO/SLA Management: Define, measure, and maintain Service Level Objectives and error budgets; balance feature velocity with reliability requirements

  • Cost Optimization: Implement strategies to optimize cloud resource utilization and reduce operational costs while maintaining performance and reliability standards

  • Collaborate Cross-Functionally: Partner with engineering teams to improve system design for reliability, influence architectural decisions, and embed SRE best practices

  • On-Call Participation: Participate in on-call rotation to provide 24/7 support for critical production systems

  • Documentation: Create and maintain runbooks, operational procedures, and technical documentation to enable team scalability

What You'll Need:

  • Experience in Site Reliability Engineering, DevOps, or similar roles supporting large-scale distributed systems in production environments

  • Strong programming skills in at least one language (Go) for automation and tooling development

  • Deep cloud expertise with hands-on experience in at least one major cloud platform (AWS or GCP), including compute, storage, networking, and managed services

  • Distributed systems knowledge: Understanding of distributed system design patterns, consistency models, fault tolerance, and scalability principles

  • Infrastructure as Code: Proficiency with IaC tools (Terraform) and configuration management (Ansible, Chef, Puppet)

  • Container orchestration: Experience with Kubernetes, Docker, Podman, and container-based deployment patterns

  • Observability expertise: Hands-on experience with monitoring and observability tools (Prometheus, Grafana)

  • CI/CD pipelines: Experience building and maintaining continuous integration and deployment pipelines

  • Incident management: Proven track record of managing high-severity incidents and implementing preventive measures

  • Data-driven approach: Ability to analyze system metrics and logs to identify trends, anomalies, and optimization opportunities

  • Communication skills: Excellent verbal and written communication abilities for remote collaboration across global teams

Bonus Points:

  • Massive scale experience: 3+ years owning systems handling over 1 trillion requests per day or more than 10 PB of data per day

  • Multi-cloud experience: Hands-on work with hybrid or multi-cloud environments

  • Database expertise: Deep knowledge of distributed databases, data lakes, or SIEM platforms (ClickHouse, Redis, MySQL)

  • Security background: Exposure to cybersecurity, threat intelligence, or security operations

  • Networking expertise: Advanced understanding of network protocols, load balancing, and CDN technologies


Benefits of Working at CrowdStrike:

  • Remote-friendly and flexible work culture

  • Market leader in compensation and equity awards

  • Comprehensive physical and mental wellness programs

  • Competitive vacation and holidays to recharge

  • Paid parental and adoption leaves

  • Professional development opportunities for all employees regardless of level or role

  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections

  • Vibrant office culture with world class amenities

  • Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at [email protected] for further assistance.

Top Skills

Ansible
AWS
Chef
Docker
GCP
Go
Grafana
Kubernetes
Podman
Prometheus
Puppet
Terraform


The Company
HQ: Austin, TX
10,000 Employees
Year Founded: 2011

What We Do

CrowdStrike has redefined security with the world’s most advanced cloud-native platform that protects and enables the people, processes and technologies that drive modern enterprise. Tested and proven, the world's largest organizations trust CrowdStrike to stop breaches with unparalleled protection against the most sophisticated cyberattacks.

The CrowdStrike culture has been built upon our Core Values since the day we began. We are Fanatical About the Customer, Relentlessly Focused on Innovation and believe that our Limitless Passion drives Unlimited Potential for every CrowdStriker. As a purpose-built remote-first company, we believe cultivating a connected culture for every employee, no matter where they are in the world, is a key ingredient in building a high-performing, diverse team.

We don’t have a mission statement. We’re on a mission—to stop breaches. Ready to join a mission that matters?

Why Work With Us

We have a culture that celebrates achievement, encourages flexibility and innovation and thrives on teamwork. We all work towards a single mission: to stop breaches. This common goal drives a sense of community and connection among our people across the globe.


CrowdStrike Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Flexible
HQ: Austin, TX
Singapore
Osaka
Aarhus, DK
Arlington, VA
Barcelona, ES
Bengaluru, IN
Brussels, BE
Bucharest, RO
Cheltenham, GB
Copenhagen, DK
Dubai, AE
Irvine, CA
Kirkland, WA
Minneapolis, MN
Mumbai, IN
New Delhi, IN
Pune, IN
Reading, GB
Riyadh, SA
Saint Louis, MO
Sunnyvale, CA
Sydney, AU
Tel Aviv-Yafo, IL
Tokyo, JP