Engineering Manager - Chaos Platform

Posted 2 Days Ago
2 Locations
Hybrid
187K-240K Annually
Mid level
Artificial Intelligence • Cloud • Software • Cybersecurity
We are building the monitoring and security platform for developers, IT ops teams and business users in the cloud age.
The Role
As an Engineering Manager, you will lead a team enhancing the Chaos Platform, foster collaboration on system resilience, and mentor engineers in Chaos Engineering practices.
Summary Generated by Built In

Chaos Platform is an SRE team in our Resilience Engineering organization whose mission is to enable engineering teams at Datadog to improve and maintain the resilience of their services. We offer engineers a library of different failure scenarios that they can use to verify if their systems operate as expected by embracing a mindset of experimentation and the practices of Chaos Engineering. Being able to simulate realistic failures in a complex ecosystem like Datadog, while limiting the blast radius in a safe way, requires our engineers to be strong software engineers with a good awareness of how distributed systems can break at scale.

As an Engineering Manager, you will help us realize this mission of enabling Chaos Engineering by continuing to build out our Chaos Platform and scaling the team by hiring and growing our engineering talent. You will play an instrumental role in helping us build more adoption and impact on company-wide projects, and actively drive collaboration between the team and key stakeholders.

At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.

What You’ll Do

  • Lead and mentor a team of experienced SWEs around the globe who are passionate about building a culture of reliability at Datadog. Help engineers grow to the next level and continuously provide them opportunities to develop. 
  • Build exciting new features for our self-service Chaos Engineering platform to solve real user problems and ensure our platform remains relevant and well-integrated within the wider Datadog ecosystem.
  • Advocate for and build out Chaos Engineering as a practice by focusing the team’s efforts on empowering engineers at Datadog to verify their resilience to failures, including in production.
  • Work with stakeholders across Datadog to build consensus on shared topics around resilience – for example, collaborating with infrastructure teams to ensure we have consistent graceful degradation and disaster recovery solutions, or with security teams to ensure adversary testing works for engineers.
  • Build a culture of continuous learning by incentivizing engineers to invest in chaos experiments, team-level game days, and ways to proactively discover failures before they happen in production. 

Who You Are

  • 2-3 years of experience as a people manager or as a technical leader with strong mentorship skills. Ideally, candidates will have experience in career development, performance management, tracking and optimizing team velocity, sprint planning, OKRs, and hiring.
  • 2-3 years of experience in SRE, Resilience, Chaos Engineering, or any domain that adopts a mindset of proactively breaking software in order to learn. Although we create impact through our self-service platforms, our ultimate goal is to enable the company to build resilience.
  • Technical pragmatism and an ability to help the team reason about trade-offs around implementation. You will often review the decisions and RFCs from senior engineers, and you will need to blend both your technical and business acumen to do this. 
  • Strong distributed systems knowledge, especially around Kubernetes and how controllers are designed, implemented, and operated. We collaborate a lot with our infrastructure partners, so a solid understanding of Linux internals (e.g. control groups, namespaces, and the networking stack) will serve you well when leading a technical team.
  • Strong stakeholder management skills. This will involve using your empathy, collaboration, and communication skills in English to work remotely with people across teams. Because our programs touch lots of different parts of the business, we frequently collaborate with stakeholders across Datadog and need to motivate teams to work together towards a shared goal.

Datadog offers a competitive salary and equity package, and may include variable compensation. Actual compensation is based on factors such as the candidate's skills, qualifications, and experience. In addition, Datadog offers a wide range of best-in-class, comprehensive, and inclusive employee benefits for this role, including healthcare, dental, parental planning, and mental health benefits, a 401(k) plan and match, paid time off, fitness reimbursements, and a discounted employee stock purchase plan.

The reasonably estimated yearly salary for this role at Datadog is:

$187,000 - $240,000 USD

About Datadog: 

Datadog (NASDAQ: DDOG) is a global SaaS business, delivering a rare combination of growth and profitability. We are on a mission to break down silos and solve complexity in the cloud age by enabling digital transformation, cloud migration, and infrastructure monitoring of our customers’ entire technology stacks. Built by engineers, for engineers, Datadog is used by organizations of all sizes across a wide range of industries. Together, we champion professional development, diversity of thought, innovation, and work excellence to empower continuous growth. Join the pack and become part of a collaborative, pragmatic, and thoughtful people-first community where we solve tough problems, take smart risks, and celebrate one another. Learn more about #DatadogLife on Instagram, LinkedIn, and Datadog Learning Center.

Equal Opportunity at Datadog:

Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. Here are our Candidate Legal Notices for your reference.

Your Privacy:

Any information you submit to Datadog as part of your application will be processed in accordance with Datadog’s Applicant and Candidate Privacy Notice.

Top Skills

Distributed Systems
Kubernetes
Linux

What the Team is Saying

Kyvaune
Sales Engineering
“Working at Datadog has afforded me the opportunity to consistently push myself out of my comfort zone, while feeling fully supported by amazing colleagues. In my short time here, I’ve grown tremendously as an individual, professional & technologist!”

The Company
HQ: New York, NY
5,000 Employees
Hybrid Workplace
Year Founded: 2010

Why Work With Us

At Datadog, we learn from and celebrate each other daily - each win is a team win. Datadogs solve tough problems, innovate pragmatically, and grow together. We promote from within, provide mentorship and opportunities for career development, and support our colleagues in the process. Best of all? We truly love what we do.

Datadog Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them and their team.

Typical time on-site: 3 days a week
HQ: New York, NY
SG
New South Wales
Amsterdam, NL
Boston, MA
Denver, CO
Dublin, IE
Hanyang, KR
Lisbon, PT
Madrid, ES
Paris, FR
San Francisco, CA
Tokyo, JP
