Engineering Manager, Site Reliability (SRE)

Posted 4 Days Ago
Hiring Remotely in United States
160K-200K Annually
Senior level
Information Technology • Security • Cybersecurity
The Role
Lead a Site Reliability Engineering team to ensure product reliability, oversee incident management, and collaborate with other engineering teams on performance issues.
About Us

At SentinelOne, we’re redefining cybersecurity by pushing the limits of what’s possible—leveraging AI-powered, data-driven innovation to stay ahead of tomorrow’s threats.

From building industry-leading products to cultivating an exceptional company culture, our core values guide everything we do. We’re looking for passionate individuals who thrive in collaborative environments and are eager to drive impact. If you’re excited about solving complex challenges in bold, innovative ways, we’d love to connect with you.

What are we looking for?

Please note that under Federal & FedRAMP regulations, hiring for this role is limited to US citizens.

FedRAMP Staff may be subject to customer or third-party background checks up to and including secret clearance if required by their role at SentinelOne.

We are seeking an experienced engineering and operations manager to lead a Site Reliability Engineering (SRE) team at SentinelOne. As the Manager of SRE, you will manage a team of SRE professionals responsible for ensuring the reliability and scalability of our products and production services, focusing on the experience our customers have in production every day. You will work closely with other engineering teams to identify and address availability, performance, and capacity issues, and you’ll be a key partner for our externally facing teams, including Support, Customer Success, and Sales Engineering. This is a highly visible role within S1 with frequent executive communication opportunities, and a great opportunity to do good work with good people all around the world.

As a team we value:

  • Thinking from first principles, understanding second order impacts 
  • Curiosity to understand new systems, their operating principles and limitations 
  • Strong operational ownership and a desire to reduce toil via automation
  • A drive to learn, especially from prior failures
  • Courage to take risks and make things happen 
  • Empathy and humility to collaborate effectively with peers and across teams
What will you do?
  • Grow and lead a team of SRE professionals, including setting performance goals and measuring deliverables against key metrics, while evolving those metrics as S1 grows and needs develop
  • Invest in data-driven deep triage on recurring issues, collaborating with other engineering teams to identify and address issues related to reliability, performance, and capacity
  • Develop, improve, and implement processes for the full incident lifecycle, including incident management, post-incident analysis, and learning from incidents. Lead incident response efforts, including coordinating with other teams to investigate and resolve customer-impacting incidents 
  • Design a support model for SRE covering service maturity and service ownership, including monitoring and alerting improvements, and SLI / SLO design and implementation
  • Analyze production metrics and signals to identify areas for improvement and take proactive steps to mitigate issues
  • Develop and implement best practices and standards for Site Reliability Engineering, from day-to-day operations to hiring and planning 
  • Communicate effectively with cross-functional teams to ensure alignment on objectives and priorities. Deliver outcomes, not just stories and tasks. 
What skills and knowledge should you bring?
  • 8+ years of related engineering experience, with at least 2 years in a management role
  • Demonstrated experience leading technical and operational teams at various stages of maturity 
  • Excellent analytical and problem-solving skills
  • Familiarity with modern software development methodologies, tools, and techniques, including CI/CD
  • Experience working with cloud-native applications and large-scale distributed systems, including a working knowledge of technologies such as Kubernetes and Terraform/IaC, and cloud providers such as AWS or GCP
  • Experience with various monitoring and alerting techniques and tools, including frameworks and concepts such as SLOs, OTel and Golden Signals as well as tooling such as Prometheus and Grafana 
  • Extensive experience with incident response and management at various layers of the stack across different business needs and applications, including both hands-on experience leading incidents/post-incident analysis and experience driving broader incident management initiatives 
  • Ability to thrive in a fast-paced, dynamic environment
Why us?

You will be joining a cutting-edge company where you will tackle extraordinary challenges and work with the very best in the industry.

  • Medical, Vision, Dental, 401(k), Commuter, Health and Dependent FSA
  • Unlimited PTO
  • Industry-leading gender-neutral parental leave
  • Paid Company Holidays
  • Paid Sick Time
  • Employee stock purchase program
  • Disability and life insurance
  • Employee assistance program
  • Gym membership reimbursement
  • Cell phone reimbursement
  • Numerous company-sponsored events, including regular happy hours and team-building events

This U.S. role has a base pay range that will vary based on the location of the candidate. For some locations, a different pay range may apply. If so, this range will be provided to you during the recruiting process. You can also reach out to the recruiter with any questions.

Base Salary Range
$160,000-$200,000 USD

SentinelOne is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

SentinelOne participates in the E-Verify Program for all U.S. based roles. 

Top Skills

AWS
CI/CD
GCP
Grafana
Kubernetes
Prometheus
Terraform

The Company
HQ: Mountain View, CA
2,830 Employees
Year Founded: 2013

What We Do

SentinelOne is a leading provider of autonomous security solutions for endpoint, cloud, and identity environments. Founded in 2013 by a team of cybersecurity and defense experts, SentinelOne revolutionized endpoint protection with a new, AI-powered approach. Our platform unifies prevention, detection, response, remediation, and forensics in a single, easy-to-use solution.
Our endpoint security product is designed to protect your organization's endpoints from known and unknown threats, including malware, ransomware, and APTs. It uses artificial intelligence to continuously learn and adapt to new threats, providing real-time protection and automated response capabilities.

SentinelOne's approach to security is designed to help organizations secure their assets with speed and simplicity. We give them the ability to detect malicious behavior across multiple vectors, rapidly eliminate threats with a fully automated, integrated response, and adapt their defenses against the most advanced cyberattacks.

We are recognized by Gartner in the Endpoint Protection Magic Quadrant as a Leader and have enterprise customers worldwide. Our customers include some of the world's largest companies in various industries such as finance, healthcare, government, and more.

At SentinelOne, we understand that cybersecurity is a constantly evolving field and that the threats facing organizations are becoming increasingly sophisticated. That's why we are committed to staying at the forefront of technology and innovation and providing our customers with the best protection against cyber threats.

We offer our customers a wide range of services, including threat hunting, incident response, and incident management. Our team of experts is available to assist you 24/7 and can help you respond to and manage cyber incidents quickly and effectively.

To learn more about our products and services, please visit our website at www.sentinelone.com or contact us to schedule a demo.
