Plume Design, Inc

Staff Site Reliability Engineer

Sorry, this job was removed at 04:08 p.m. (CST) on Monday, Sep 08, 2025

Palo Alto, CA

In-Office

Big Data • Internet of Things • Machine Learning

The Role

Life at Plume

At Plume, we believe that technology isn't about moving faster, it's about making life’s moments better. Which is why we’ve built the world's first, and only, open and hardware-independent service delivery platform for smart homes, small businesses, enterprises, and beyond. Our SaaS platform uses WiFi, advanced AI, and machine learning to create the future of connected spaces—and human experiences—at massive scale.

We now deliver services to over 60 million locations globally and have managed over 3 billion devices on our platform. We’re expanding rapidly, pioneering a new category, and we achieved our Series F funding in just four years. Our customers include many of the world's largest Internet Service Providers (ISPs) who look to Plume to help them evolve their smart home offerings while gleaning insights from their own data.

With a bias for action and a love for being trailblazers, the team at Plume embodies a combination of relentless curiosity and imaginative innovation. We challenge ourselves to think in ways that other companies don't, work to do what should be done (rather than what can), and if we can’t do it exceptionally well, we don’t do it. It’s how we've assembled a team of world-class builders, thinkers, and doers. And it’s how we’re reinventing what’s possible every day.

We’re looking for a seasoned Site Reliability Engineer with experience in customer-facing environments to provide technical leadership for our Site Reliability Engineering team. The team focuses on deployments, production infrastructure, availability, and reliability. The right candidate has held several infrastructure-oriented roles and has strong technical knowledge of the DevOps/SRE technology stack, along with a focus on customer satisfaction.

Responsibilities:

Lead a team of Site Reliability Engineers who provide first-line support for customer clouds. Routine tasks include deployments, on-call, and application provisioning.
Run stand-ups for the team and manage tickets
Participate in sprints and close tickets with the team
Attend and conduct customer meetings for project and roadmap specification
Be able to step in to execute on or triage issues. Some examples:

Provision and scale Kubernetes Infrastructure and Applications (EKS)
Deploy software in multiple production environments
Own monitoring and alerting for production systems, including improvements and changes
Contribute improvements to the current automation
Contribute improvements to our on-call process and alerting

Qualifications:

10+ years of experience with production troubleshooting
Experience leading or mentoring others
Executive communication skills
Bachelor’s degree in a related field or equivalent experience; advanced degree preferred
Technical knowledge and working experience with:

Kubernetes (operate)
Basic Terraform knowledge
Programming or scripting experience in at least one language (e.g., Perl, Python, PHP, Go, Java)
Experience with modern cloud infrastructure, preferably AWS
Experience with modern Linux operating systems (Enterprise Linux or Debian-based)
Experience both setting up and using self-managed monitoring and observability tools (e.g., Nagios/Icinga, Grafana, Prometheus)

Differentiators:

Troubleshooting production performance/service degradation or outage issues at scale
Experience with Infrastructure Troubleshooting in VMs and/or Bare Metal (ssh/Linux)
Advanced Kubernetes knowledge
Advanced Terraform knowledge
Customer-facing experience in previous roles
Experience operating Kafka in Production
Experience operating NoSQL Databases in Production
Experience operating Relational Databases in Production
Configuration Management experience

Please note that this is a hybrid position that requires working in the office three days a week. We’re looking for candidates who are within commuting distance. At this time, we are unable to provide relocation assistance.

The total compensation package includes an anticipated compensation range of $177,000 - $208,000, plus bonus, equity, and benefits. Benefits include a 401(k) plan with a company match, basic life insurance, and unparalleled health, dental, vision, and other benefits and perks. For more details, please see: https://www.plume.com/careers

An employee’s base salary and its position within the range may depend on a number of factors, including job-related knowledge, education, skills, experience, and other business-related considerations. Published ranges are provided in good faith at the time of posting.

About Plume

As the creator of the only open, hardware-independent, cloud-controlled experience platform for ISPs and their subscribers, Plume partners with over 400 ISP customers, including some of the world’s largest such as Comcast, Charter, Liberty Global, and J:COM.

Using OpenSync, the most widely supported open-source, silicon-to-cloud framework for smart spaces, Plume’s software-defined network allows ISPs to decouple their service offerings from hardware and rapidly curate and deliver new services over a multi-vendor, open-platform architecture.

Plume is an equal opportunity workplace that maintains a continuing policy of nondiscrimination in all employment practices and decisions, ensuring equal employment opportunities for all qualified individuals without regard to race, color, creed, religion, sex, national origin, age, physical or mental disability, sexual orientation, gender identity, marital status, pregnancy, childbirth or related individual conditions, medical conditions (as defined by state law), military or veteran status, or any other characteristic protected by federal, state or local law.

The Company

HQ: Palo Alto, CA

611 Employees

Year Founded: 2015

What We Do

The world’s first SaaS experience platform for Communications Service Providers.

At Plume, we believe that technology isn't about moving faster. It's about making moments better. Which is why we've brought relentless focus to understanding the digital lifestyles people want to live, the spaces where they play out, and innovating ways to make digital experiences blossom. As the only open and hardware-independent solution, Plume enables the rapid delivery of new services for connected homes, small businesses, and beyond at massive scale. For residential subscribers, Plume delivers self-optimizing WiFi, cybersecurity, access controls, and more. Our purpose-built suite of smart services turns small business networks into fully connected business intelligence platforms. And Service Providers get robust back-end applications for unprecedented visibility and support.

Our goal is simple: to keep you in the flow with any experiences you turn on. And to help you fill the spaces that matter to you with all kinds of wonderful.

With a bias for action and love for breaking molds, the team at Plume embodies a combination of relentless curiosity and imaginative innovation. We constantly challenge ourselves to think in ways that other companies don't, and work to do what should be done (rather than what can). It’s how we've assembled a team of world-class engineers, thinkers, and doers. And it’s how we’re reinventing what’s possible every day.

We believe that the core competitive advantage in the over-the-top era resides in the ability to create new services at high cadence, deploy them at massive scale, and orchestrate them through a common data set. Our flexible, cloud-controlled platform, paired with an open-source software stack, decouples service creation and delivery from dependence on proprietary hardware.