DevOps Engineer (Resiliency)

Posted 14 Days Ago
Tel Aviv-Yafo
Hybrid
3-5 Years Experience
Productivity • Software
monday.com makes work click
The Role
Develop and maintain disaster recovery and high availability strategies for infrastructure, design and implement multi-region architectures, monitor infrastructure capacity, and collaborate with engineering teams.
Summary Generated by Built In

monday.com is looking for a Resilience Engineer to join our newly formed Resilience Engineering team. You will be responsible for developing and maintaining disaster recovery and high availability strategies that ensure the resilience of our infrastructure.


About The Role:

  • Develop and maintain comprehensive disaster recovery (DR) plans to ensure rapid recovery from system failures and incidents.
  • Design and implement multi-region architectures to enhance system reliability and ensure high availability across all critical services.
  • Create and manage high availability (HA) strategies to minimize downtime and ensure continuous service delivery.
  • Monitor and plan infrastructure capacity to support resilience initiatives and ensure scalability during failures.
  • Work closely with other engineering teams to integrate resilience solutions into the overall infrastructure.

Our Stack: AWS, Terraform, Kubernetes, Datadog, OpenTelemetry


Social Title:

DevOps Engineer


Our Team:

The R&D Team is passionate about building innovative and lovable products while tackling complex engineering problems at scale. We’re accountable for bringing the company’s vision to life by turning plans into flawless execution and encouraging full ownership and independence in all projects. The Infra role is crucial as our company scales and our user base grows, and it covers all aspects of product and infrastructure challenges. We focus on development-flow productivity, application infrastructure, and production resilience. We face huge challenges related to the hypergrowth of our engineering organization, application, and data scale.

Requirements

  • Strong experience in disaster recovery planning and high availability architecture.
  • Proven ability to design and implement multi-region architectures.
  • Familiarity with cloud services and infrastructure as code (IaC) practices.
  • Strong collaboration and communication skills.
  • Ability to work in a fast-paced, dynamic environment focused on resilience and reliability.

Top Skills

AWS
Datadog
Kubernetes
OpenTelemetry
Terraform

The Company
HQ: New York, NY
1,500 Employees
Hybrid Workplace
Year Founded: 2012

What We Do

monday.com is a work operating system that transforms the way teams work together. We’ve created a solution that connects people to workplace processes, promoting a culture of transparency & empowerment. We’re obsessed with building an excellent product. Our goal is to create a work operating system that people will love to use, one that’s fast, beautiful & responsive.

Why Work With Us

At monday.com, we believe in transparency, accountability, and impact. Together, those values have created a strong culture of professional and creative autonomy where every team member is encouraged to share ideas and help bring them to life!

monday.com Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

monday.com embraces a flexible work environment with our hybrid model!

Typical time on-site: 3 days a week
HQ: New York, NY
Chicago, IL
Denver, CO
London, GB
Melbourne, VIC
Miami, FL
São Paulo, BR
Sydney, NSW
Tel Aviv-Yafo, IL
Warsaw, PL