Site Reliability Engineer Lead

Posted 2 Days Ago
Houston, TX, USA
In-Office
Senior level
Other • Energy
The Role
Lead SRE practices for GCP-based data platforms, automate workflows, design reliable architectures, mentor engineers, and improve operational processes.
Summary Generated by Built In

Brief Description:

We are seeking a Site Reliability Engineer Lead to own and evolve the reliability, scalability, and operational excellence of cloud-native data platforms running primarily on Google Cloud Platform (GCP). This role supports data systems that ingest, process, and serve large volumes of operational data from oilfield and energy environments. The ideal candidate is a cloud-first SRE with deep GCP experience, strong Python engineering skills, and a track record of leading reliability initiatives for data-intensive systems.

 

Detailed Description:

  • Lead SRE practices for GCP-based data platforms

  • Design and own SLIs, SLOs, error budgets, and reliability metrics

  • Build and maintain cloud-native observability (monitoring, logging, alerting)

  • Lead incident response for production cloud systems and drive postmortems

  • Partner with data engineering and platform teams to design reliable architectures

  • Automate operational workflows using Python

  • Drive improvements in CI/CD, infrastructure as code, and deployment safety

  • Mentor engineers and set SRE best practices across the team

 

Required Knowledge, Skills, and Abilities:

  • 7+ years in SRE, Cloud Platform Engineering, or DevOps

  • Strong hands-on experience with Google Cloud Platform, including:

  • GCP: GKE, Compute Engine, Cloud Storage, Pub/Sub (or equivalents)

  • Cloud Monitoring & Logging

  • BigQuery

  • Dataflow

  • Datastream

  • IAM and networking

  • Composer/Airflow

  • Kubernetes: deployment, scaling, reliability patterns

  • CI/CD: GitHub Actions, GitLab CI, or similar

  • Observability: GCP Cloud Monitoring, Logging

  • Experience supporting cloud-native data systems (batch and streaming)

  • Production experience with Python for automation, tooling, or services

  • Infrastructure as Code experience (Terraform strongly preferred)

  • Experience operating systems in 24/7 production environments

 

Minimum Qualifications:

  • Bachelor’s degree in Business, Information Technology, Computer Science, or a related field.

  • 5+ years of experience in Site Reliability Engineering, Cloud Platform Engineering, or DevOps

  • 3+ years operating production workloads on Google Cloud Platform (GCP)

  • Prior technical leadership experience (lead engineer, tech lead, or ownership of reliability initiatives)

  • Ability to understand and speak English at a level of proficiency allowing employee to issue, receive and respond to both safety and operations-related directions in English

Preferred Qualifications:

  • Oil and Gas Industry knowledge

  • Technology/Digital Industry knowledge

About Us
The Evolving Oil Field Demands Evolving Service Providers

NexTier is a leading provider of integrated completions that employs sustainable practices and equipment to support our customers’ ESG goals while accelerating production in the most demanding US land basins.

Top Skills

BigQuery
CI/CD
Cloud Logging
Cloud Monitoring
Cloud Storage
Compute Engine
Dataflow
Datastream
GitHub Actions
GitLab CI
GKE
Google Cloud Platform
IAM
Kubernetes
Pub/Sub
Python
Terraform

The Company
HQ: Houston, TX
1,900 Employees
Year Founded: 1978

What We Do

Patterson-UTI Energy subsidiaries provide onshore contract drilling and pressure pumping services. Patterson-UTI Energy, Inc. pushes the boundaries of innovation so you can embrace new possibilities. With expertise and scale in major operational areas, we provide a diverse network of drilling and pressure pumping services, directional drilling, rental equipment and technology to forge your path to success. Our oilfield solutions deliver results that lead your business into the next generation of oil and gas. With headquarters in Houston, Texas and regional offices throughout our operating areas, let’s team up to advance your business.
