Kevala

Staff Site Reliability Engineer

Reposted 18 Days Ago

Hiring Remotely in San Francisco, CA, USA

In-Office or Remote

136K-180K Annually

Senior level

Big Data • Energy • Big Data Analytics

Decarbonizing global energy with comprehensive, transparent data solutions.

The Role

The Staff Site Reliability Engineer will lead the design and maintenance of cloud infrastructure on GCP, drive IaC strategy, manage Kubernetes operations, ensure security compliance, and mentor engineers.

As a Staff Site Reliability Engineer, you will be a key technical leader responsible for the architecture, reliability, and security of our entire cloud infrastructure. You will drive technical direction, mentor engineers, and solve our most complex infrastructure challenges as a hands-on contributor.

You will lead the management of our Google Cloud Platform (GCP) environment, drive our Infrastructure as Code (IaC) strategy, and ensure our Kubernetes-based microservices are deployed seamlessly and securely. You will serve as the expert for scalability, observability, and building the robust, automated systems that power Kevala's continuous deployment pipeline.

The applicant must have current, unrestricted work authorization in the United States. This job is not eligible for visa sponsorship.

What you will be doing

Architect & Maintain: Design, build, and maintain our core cloud-native infrastructure on Google Cloud Platform (GCP) following established best practices.
Infrastructure as Code (IaC): Lead our IaC strategy, writing and reviewing high-quality Terraform to manage all cloud resources in a repeatable and version-controlled way.
Kubernetes Operations: Manage and scale our Google Kubernetes Engine (GKE) clusters, including configuration of ingress and monitoring components.
Champion Security & Compliance: Integrate, implement, and audit security best practices across all infrastructure layers (GCP IAM, GKE policies, network security), ensuring regulatory compliance and leading incident response.
Database Reliability: Manage the provisioning, scaling, and reliability of our Postgres databases (e.g., Cloud SQL) and other data stores.
Observability: Build and refine our monitoring, tracing, logging, and alerting systems (e.g., OpenTelemetry, Grafana, Google Cloud's operations suite) to ensure high availability.
Mentorship and Design: Partner with engineering teams on scalable architecture design. Mentor other engineers on DevOps practices, cloud architecture, and security.

What you need to succeed

Experience: 8+ years in an SRE, DevOps, or Infrastructure Engineering role, with a proven track record of operating in a Staff or similar technical leadership capacity.
Leadership & Communication: Excellent communication skills with the ability to clearly articulate complex technical decisions, mentor team members, and drive projects to completion.
GCP Proficiency: Extensive hands-on experience designing and managing production environments in Google Cloud Platform.
Kubernetes (K8s) Expert: Advanced knowledge of Kubernetes and its ecosystem (GKE preferred), including cluster administration and deployment tooling (e.g., Helm).
Terraform/IaC: Extensive, production-level experience using Terraform to manage complex cloud environments.
Automation: Deep experience with automation tooling and scripting (e.g., Bash, Python, Go) to manage infrastructure and operations at scale.
Database Skills: Experience managing and scaling relational databases like Postgres in a production environment.
Security Implementation & Auditing: Practical experience designing, implementing, and auditing security controls for cloud infrastructure, networks, and applications (e.g., IAM, network security).

The compensation for this opportunity includes a base salary range of $136,000 - $180,000, plus equity (stock options). This is our target compensation range and is subject to multiple factors, including level, experience, and location. As you go through our interview process, our recruiter will work with you to identify a competitive base salary within the proposed range and combine it with an equity package that reflects your excitement about joining Kevala.

This is a fully remote role which can be located anywhere within the United States. Please note that actual salaries may vary based on factors including, but not limited to, education, experience, and location.

Skills Required

8+ years in an SRE, DevOps, or Infrastructure Engineering role
Extensive hands-on experience in Google Cloud Platform
Advanced knowledge of Kubernetes
Extensive experience using Terraform
Deep experience with automation tooling and scripting
Experience managing and scaling relational databases like Postgres
Practical experience designing and implementing security controls

The Company

HQ: San Francisco, California

40 Employees

Year Founded: 2014

What We Do

At Kevala, we are on a mission to decarbonize the global energy economy using the most comprehensive data sets available. We are a group of ambitious intellectuals who embrace unconventional approaches to solving complex problems. We foster a culture where everyone is encouraged to collaborate, create, and support one another in our collective endeavors. As a fast-growing startup, we are looking for individuals who are passionate about the environment and excited to join in on our mission to make energy-related data meaningful, transparent, and broadly accessible.

Why Work With Us

Kevala is unique because it combines advanced analytics, technology, and energy expertise to help modernize the grid and accelerate clean energy adoption. Employees work on meaningful, high-impact challenges in a collaborative, mission-driven environment while helping shape a smarter, more sustainable energy future for communities everywhere today.