Staff Site Reliability Engineer

Bengaluru, Bengaluru Urban, Karnataka, IND
In-Office
Senior level
Cloud
The Role
The role involves managing a SaaS platform, driving improvements in SDLC, automating processes, and supporting a global on-call system while leading and mentoring peers.

Get to know Okta
Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth.
At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box; we’re looking for lifelong learners and people who can make us better with their unique experiences.
Join our team! We’re building a world where Identity belongs to you.

What You’ll Be Doing
  • Design, build, and operate highly scalable, reliable, and secure infrastructure powering our production systems across AWS and GCP.
  • Lead major reliability and modernization initiatives, including container platform migrations (e.g., ECS to EKS/GKE) and microservice enablement across multi-cloud environments.
  • Serve as a technical authority in Kubernetes (EKS and GKE), cloud infrastructure (AWS and GCP), and modern CI/CD practices (GitOps, automation pipelines).
  • Partner with development teams to architect and enable microservice-based applications, ensuring production readiness, scalability, and observability.
  • Implement and manage infrastructure as code (Terraform, Ansible) to automate provisioning, scaling, and configuration management across multiple cloud providers.
  • Drive improvements in observability, performance, and cost efficiency through robust monitoring, logging, and alerting systems that span AWS and GCP.
  • Champion SRE best practices — defining SLOs/SLIs, conducting blameless postmortems, and continuously improving incident response.
  • Lead complex technical projects from conception to completion, managing timelines and technical dependencies across teams.
  • Mentor engineers across teams, fostering a culture of reliability, automation, and continuous learning.
  • Collaborate with security and compliance partners to ensure infrastructure adheres to best practices and standards (e.g., IAM Federation, Workload Identity).
  • Participate in the on-call rotation, using incidents as learning opportunities to enhance systems and processes.
What You’ll Bring to the Role:
  • Strong hands-on experience architecting and operating cloud-native distributed systems (AWS and GCP).
  • Deep expertise with Kubernetes (EKS and GKE) — design, provisioning, scaling, and advanced troubleshooting in production.
  • Proven experience leading ECS to EKS/GKE migrations and driving microservice enablement initiatives at scale.
  • Proficiency with Infrastructure as Code tools such as Terraform (multi-provider), Ansible, or CloudFormation.
  • Solid coding and scripting ability in Python, Go, or Shell, with a focus on automation, tooling, and operational excellence.
  • Advanced understanding of CI/CD pipelines (ArgoCD, GitLab CI, Spinnaker), Linux systems, and networking fundamentals (Direct Connect/Interconnect, DNS, routing, load balancing).
  • Experience managing databases and caching systems (e.g., RDS/Cloud SQL, Redis/Memorystore, PostgreSQL, MySQL) in cloud environments.
  • Hands-on experience with observability tools (Prometheus, Grafana, ELK, Loki, OpenTelemetry, Google Cloud Operations) for performance and reliability insights.
  • Working knowledge of container security, secrets management (HashiCorp Vault, AWS Secrets Manager, Google Secret Manager), and compliance in production environments.
  • Strong communication and problem-solving skills, with demonstrated success leading cross-team projects and mentoring peers.

Experience:

  • 8+ years in SRE, DevOps, or Infrastructure Engineering roles.
  • 3–5 years of experience with Kubernetes (EKS/GKE) and related ecosystem tools (Helm, Karpenter, etc.) in production.
  • 3–5 years of experience with AWS and GCP.
  • 3–5 years using Terraform to manage multi-cloud infrastructure.
  • 5+ years of coding experience in Python, Go, or similar languages.
  • Proven track record leading high-impact projects, specifically migration projects (ECS → EKS/GKE) and enabling microservice architectures.
  • Experience implementing SLOs/SLIs, performing root cause analyses, and improving operational resilience.
  • Prior work in SaaS or high-scale, cloud-native environments is a strong plus.
  • Strong Linux and security fundamentals.
  • Bachelor’s degree in Computer Science or equivalent hands-on experience.

What you can look forward to as a Full-Time Okta employee!

  • Amazing Benefits
  • Making Social Impact
  • Developing Talent and Fostering Connection + Community at Okta

Some roles may require travel to one of our office locations for in-person onboarding.

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws.
If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding, please use this Form to request an accommodation.
Notice for New York City Applicants & Employees: Okta may use Automated Employment Decision Tools (AEDT), as defined by New York City Local Law 144, that use artificial intelligence, machine learning, or other automated processes to assist in our recruitment and hiring process. In accordance with NYC Local Law 144, if you are an applicant or employee residing in New York City, please click here to view our full NYC AEDT Notice.
Okta is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Personnel and Job Candidate Privacy Notice at https://www.okta.com/legal/personnel-policy/.

Top Skills

Ansible
AWS
Chef
CI/CD
ECS
Go
Kubernetes
Linux
Python
Rancher
Rust
Terraform

The Company
HQ: San Francisco, CA
6,000 Employees
Year Founded: 2009

What We Do

Okta is the leading independent identity provider. The Okta Identity Cloud enables organizations to securely connect the right people to the right technologies at the right time. With more than 7,000 pre-built integrations to applications and infrastructure providers, Okta provides simple and secure access to people and organizations everywhere, giving them the confidence to reach their full potential. More than 10,000 organizations, including JetBlue, Nordstrom, Siemens, Slack, T-Mobile, Takeda, Teach for America, and Twilio, trust Okta to help protect the identities of their workforces and customers.
