Staff Site Reliability Engineer

Posted 16 Days Ago
Poland
Mid level
Cloud • Digital Media • Enterprise Web • Marketing Tech • Software
One app to replace them all. All your work in one place: Tasks, docs, chat, goals, & more.
The Role
The Staff Site Reliability Engineer at ClickUp will improve the stability and reliability of the organization's cloud-based infrastructure. Responsibilities include system design, troubleshooting, monitoring performance metrics, responding to outages, and collaborating with engineering teams to enhance site reliability practices.
Summary Generated by Built In

ClickUp is the world’s only all-in-one productivity platform that flexes to the way people want to work. It replaces all individual workplace productivity tools with a single, unified platform that includes project management, document collaboration, whiteboards, spreadsheets, and AI. Our dedication to enhancing productivity has earned us recognition on prestigious lists including the Forbes Cloud 100, Fast Company's Most Innovative Companies, Inc. Power Partners, and #1 on two of G2's Best Software Products lists for 2023: #1 Project Management Product and #1 Collaboration and Productivity Product. With our headquarters based in San Diego and a rapidly expanding global presence, we are shaping the future of work. Join our team at ClickUp, one of the fastest-growing SaaS companies worldwide, and help millions of users be more productive - saving them at least one day every week. 🦄

We are looking for driven and innovative software engineers with a strong site reliability engineering (SRE) discipline, or an interest in this area, to help us make ClickUp the "one app to rule them all". As an SRE at ClickUp, your primary role will be to improve the stability, availability, and reliability of the globally distributed, cloud-based infrastructure that powers our app for thousands of users daily. If you are a rockstar engineer with an entrepreneurial, fast-paced mindset who is ready to own, drive, and tackle some of the most complex problems out there, we would love to hear from you!

 

What you'll do:

  • Participate in designing and building systems for maximum performance, reliability, and scalability.
  • Work with the engineering teams on product design, decisions, and troubleshooting.
  • Improve overall stability and observability, and strengthen the metrics that track uptime.
  • Champion our monitoring infrastructure.
  • Implement and improve our general site reliability posture (error and downtime budgets, MTTD and MTTR improvements, better alerting and notifications, minimal customer impact from incidents, etc.).
  • Respond to and troubleshoot downtime events while actively developing safeguards to prevent them.
  • Participate in brainstorming sessions with the engineering team and contribute ideas to our technology and algorithms.

 

What we’re looking for:

  • 4-6+ years of experience with the Amazon Web Services ecosystem (EC2, ECS, VPC, Redis, RDS, ALB, ECR).
  • Experience working with Kubernetes.
  • Experience managing production-critical infrastructure, with a DevOps mindset.
  • Familiarity with SRE best practices and procedures.
  • Experience with IaC (CDK, Terraform) and CI/CD (GitHub Actions, ArgoCD).
  • Familiarity with containerization (Docker).
  • Knowledge of network, firewall, and security best practices.
  • Experience with self-healing automation and monitoring tools (Datadog, CloudWatch).
  • Knowledge of relational databases, preferably PostgreSQL (not mandatory).
  • A strong, operationally focused self-starter and problem-solver.
  • Excellent interpersonal, written, and oral communication skills.
  • Experience with application security testing is a plus (not mandatory).
  • Familiarity or experience with Node.js is a plus (not mandatory).
  • Experience managing Linux-based EC2 instances.

#LI-REMOTE



Unsure if you meet all the qualifications of this job description but are deeply excited about the role? We hire based on ambition, grit, and a passion for improving the way people work. If you think ClickUp is the company for you, we encourage you to apply!

ClickUp was founded on a culture of hard work, consistent growth, and a desire to break norms. We’re a values-driven company and hire based on ambition, merit, and a willingness to do what it takes to succeed. We don’t care where you’re from, what you look like, or who you’re in a relationship with—we hire the best people for the job, and create an environment that supports employees on their journey to do the most exciting work of their lives! ClickUp is an Equal Opportunity Employer, and qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.

ClickUp collects and processes personal data in accordance with applicable data protection laws.

  • If you are a European Job Applicant, see our privacy policy for further details.
  • If you are a Philippine Job Applicant, see our privacy policy and our Philippine Data Privacy Notice for further details.

Please note we are unable to sponsor or take over sponsorship of an employment visa for roles outside of engineering and product at this time. Sponsorship for engineering and product roles is not guaranteed, but is instead based on the business needs for that specific role at that time. Please reach out to the recruiter with any questions.

Top Skills

Amazon Web Services
CI/CD
Docker
Kubernetes
Terraform
The Company
HQ: San Diego, CA
1,000 Employees
Hybrid Workplace
Year Founded: 2016

What We Do

ClickUp is one app to replace them all. It's more than just task management - ClickUp offers docs, reminders, goals, calendars, and even an inbox. Fully customizable, ClickUp works for every type of team, so all teams can use the same app to plan, organize, and collaborate! ClickUp is trusted by millions of users and over 100,000 teams.

Why Work With Us

ClickUppers are highly passionate, energetic, and unique people who are aligned on the mission of saving people time and making the world more productive. We're the newcomer, the underdog, but that's where we thrive. Let’s make the world more productive, together!
