Principal Site Reliability Engineer, Google Cloud

Posted 2 Days Ago
2 Locations
Hybrid
240K-250K Annually
Expert/Leader
Software
The Role
Define and drive reliability for Saviynt's SaaS platform by designing, building, and operating scalable, reusable platform services. Lead Kubernetes platform engineering, multi-region cloud architectures, event-driven systems, CI/CD pipelines, observability, service mesh, and shared relational data services. Provide tooling, APIs, on-call support, and cross-team guidance.
Summary Generated by Built In
Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs. Built for the AI age, Saviynt is today helping organizations safely accelerate their deployment and usage of AI. Saviynt is recognized as the leader in identity security, with solutions that protect and empower the world’s leading brands, Fortune 500 companies and government institutions. For more information, please visit www.saviynt.com.

Why This Role Matters

Saviynt’s platform is mission-critical for our customers. As we scale globally, reliability, availability, and performance are not optional—they are core product features.

As a Principal Engineer, you will define and drive the reliability strategy for our SaaS platform. This is a high-impact, hands-on engineering role with broad influence across infrastructure, platform, and application teams. You will shape how Saviynt designs, operates, and measures reliability at scale.

This role is ideal for engineers who want to work on hard reliability problems, influence architecture across teams, and leave a lasting mark on a growing SaaS platform.

What You Will Do

    In this pivotal role, you will be instrumental in designing, building, and maintaining the shared infrastructure services and platforms that our product and application teams will depend on. You will focus on creating reusable, reliable, and scalable solutions that abstract away complexity, enabling other teams to focus on their core business logic and deliver features faster in a multi-cloud environment.

    •  Design and build core platform components and shared infrastructure services that other development teams will integrate with and leverage to deploy and operate their applications

    •  Architect, implement, and manage highly available and scalable Kubernetes platforms as a service for internal consumers

    •  Develop robust, internal-facing tools and automation for infrastructure provisioning and management primarily using Go (Golang)

    •  Architect and optimize foundational solutions within Cloud environments (AWS, Azure, etc.), focusing on creating reusable patterns and modules for other teams

    •  Design and implement shared Event-Driven Architecture components and messaging platforms using technologies like Kafka or Google Pub/Sub that product teams can easily utilize

    •  Develop and maintain robust CI/CD pipelines (e.g., GitLab CI and ArgoCD) as a service, providing standardized and automated deployment workflows for various development teams

    •  Design and build resilient Distributed Systems components that serve as building blocks for other applications, focusing on reliability, fault tolerance, and performance

    •  Manage and optimize our shared infrastructure across Multi-Region Cloud Environments, ensuring that platform services are globally available and performant for all consumers

    •  Establish and enhance centralized Observability and Monitoring platforms and tools that provide self-service insights for consuming teams

    •  Define and implement clear, well-documented RESTful API designs for the infrastructure services you build, ensuring ease of integration for internal clients

    •  Implement and manage Service Mesh (e.g., Envoy, Istio) capabilities, providing traffic management, security, and policy enforcement as a shared platform for services

    •  Design, implement, and optimize highly available Relational Database services or shared data platforms for broad organizational use

    •  Collaborate closely with product development teams to understand their infrastructure needs and pain points, providing technical guidance and support

    •  Participate in on-call rotations to support the critical shared infrastructure you build

What Are We Looking For

    • 9+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers

    • Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (i.e., single-tenant and multi-tenant deployment architectures)

    • Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation

    • Extensive hands-on experience with at least one major Cloud Provider (GCP is a must); multi-cloud experience is a strong plus, especially in building abstractions over them. 

    • Proven experience designing and implementing Event-Driven Architecture and message queuing systems (e.g., Kafka, RMQ, NATS) as shared services

    • Solid understanding and practical experience with CI/CD pipeline tools (especially GitLab CI) and experience establishing automated delivery processes for other teams

    • Demonstrable experience designing and operating Distributed Systems, with an understanding of patterns for creating reliable, shared components

    • Familiarity with Multi-Region Cloud Environments and strategies for building globally distributed and highly available platforms

    • Proficiency in establishing and utilizing comprehensive Observability and Monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure

    • Strong experience with RESTful API design principles and building well-documented, consumable APIs

    • Knowledge of Service Mesh concepts and practical experience with solutions like Istio in a platform context

    • Hands-on experience with Relational Databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service

    • Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences

    • A strong customer-centric mindset, treating internal development teams as your primary customers

    • Advanced Professional GCP Certification is required.

    • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, is required

If required for this role, you will:
- Complete security & privacy literacy and awareness training during onboarding and annually thereafter
- Review (initially and annually thereafter), understand, and adhere to Information Security/Privacy Policies and Procedures such as (but not limited to):

> Data Classification, Retention & Handling Policy
> Incident Response Policy/Procedures
> Business Continuity/Disaster Recovery Policy/Procedures
> Mobile Device Policy
> Account Management Policy
> Access Control Policy
> Personnel Security Policy
> Privacy Policy

Saviynt is an amazing place to work. We are a high-growth, Platform as a Service company focused on Identity Authority to power and protect the world at work. You will experience tremendous growth and learning opportunities through challenging yet rewarding work that directly impacts our customers, all within a welcoming and positive work environment. If you're resilient and enjoy working in a dynamic environment, you belong with us!

Saviynt is an equal opportunity employer and we welcome everyone to our team. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Skills Required

  • 9+ years in Infrastructure Development, Platform Engineering, or Site Reliability Engineering
  • Deep expertise with Kubernetes in production, including single-tenant and multi-tenant architectures
  • Strong programming skills in Go (Golang) and Python for building backend services and automation
  • Extensive hands-on experience with at least one major Cloud Provider (GCP required); multi-cloud experience preferred
  • Advanced Professional GCP Certification
  • Proven experience designing and implementing Event-Driven Architecture and message queuing systems (e.g., Kafka, RMQ, NATS)
  • Experience developing and managing CI/CD pipelines and automation (especially GitLab CI and ArgoCD)
  • Demonstrable experience designing and operating distributed systems with reliability and fault tolerance patterns
  • Familiarity with Multi-Region Cloud Environments and strategies for globally distributed, highly available platforms
  • Proficiency with observability and monitoring platforms (Prometheus, Grafana, ELK stack, Datadog)
  • Experience designing and building well-documented RESTful APIs for internal consumption
  • Practical service mesh experience (e.g., Envoy, Istio) in a platform context
  • Hands-on experience managing relational databases as a service (MySQL, PostgreSQL)
  • Bachelor's degree in Computer Science, Engineering, or related field, or equivalent practical/military experience
  • Excellent communication skills and customer-centric mindset for working with internal development teams
  • Participate in on-call rotations to support shared infrastructure

Saviynt Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Saviynt and has not been reviewed or approved by Saviynt.

  • Leave & Time Off Breadth: Time off is described as flexible, with policies including flexible time off and mentions of unlimited PTO. This breadth can make time away easier to take alongside company holidays.
  • Wellbeing & Lifestyle Benefits: In-office amenities such as catered food, drinks, and snacks, plus social events like birthday celebrations and team outings, are highlighted. These lifestyle perks add day-to-day convenience and connection.
  • Career-Linked Recognition & Rewards: Employee recognition is emphasized, with programs to celebrate those who go above and beyond. Regular recognition activities are cited alongside team bonding initiatives.

The Company
HQ: El Segundo, CA
Year Founded: 2010

What We Do

Saviynt’s Enterprise Identity Cloud helps modern enterprises scale cloud initiatives and solve the toughest security and compliance challenges in record time. The company brings together identity governance (IGA), granular application access, cloud security, and privileged access to secure the entire business ecosystem and provide a frictionless user experience.
