Staff Site Reliability Engineer - Splunk Expert

Bengaluru, Bengaluru Urban, Karnataka, IND
In-Office
Senior level
Cloud
The Role
The Staff Site Reliability Engineer will architect and evolve the observability ecosystem, automate infrastructure deployment, and optimize telemetry data processing.
Summary Generated by Built In

Secure Every Identity, from AI to Human
Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence.
This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

Workforce Identity Cloud

Okta Workforce Identity Cloud (WIC) provides easy, secure access for your workforce so you can focus on other strategic priorities, such as reducing costs and doing more for your customers.

If you like to be challenged and have a passion for solving large-scale automation, testing, and tuning problems, we would love to hear from you. The ideal candidate is someone who exemplifies the ethic “If you have to do something more than once, automate it” and who can rapidly self-educate on new concepts and tools.

Position Overview

We are seeking a highly technical Staff Site Reliability Engineer with deep expertise in Splunk and Grafana to own and evolve our observability ecosystem. In this role, you will move beyond simple monitoring to architect a comprehensive, scalable telemetry platform. You will be our subject-matter expert in Splunk optimisation, ensuring our logging architecture is performant, cost-effective, and deeply integrated with our automated workflows.

You will treat infrastructure as code—utilising Terraform and strong coding proficiency in Go, Python, or Ruby—to automate the deployment of agents and collectors across complex distributed systems.

Key Responsibilities
  • Splunk Architecture & Optimisation: Lead the design and tuning of Splunk environments. Optimise indexer performance, search efficiency, and data models to ensure rapid troubleshooting and cost-efficiency.
  • Advanced Visualisation: Architect and maintain sophisticated Grafana dashboards that correlate disparate data sources into a single pane of glass for real-time system health.
  • Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.
  • Pipeline Engineering: Optimise the collection, processing, and storage of telemetry data (Metrics, Logs, Traces) to ensure high reliability and low latency.
  • Workflow Automation: Develop custom Splunk workflows and integrations that trigger automated responses to system events, reducing Mean Time to Resolution (MTTR).
  • Incident Response: Participate in on-call rotations and lead post-incident reviews to drive systemic improvements through "observability-driven development."
Required Skills & Experience (The Essentials)
  • Splunk Mastery: Deep, hands-on experience with Splunk administration, search optimisation (SPL), and architecting complex data pipelines. You know how to make Splunk "hum" at scale.
  • Grafana Expertise: Proven ability to build actionable, intuitive dashboards in Grafana that go beyond simple charts to provide deep operational insights.
  • SRE Mindset: 8+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.
  • Programming Proficiency: Strong coding skills in Go, Python, or Ruby for building internal tools and automating observability workflows.
  • Telemetry Standards: Hands-on experience with OpenTelemetry (OTel), Prometheus, or similar frameworks for instrumenting applications.
  • Distributed Systems: Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/EKS).
Bonus Skills (The "Nice-to-Haves")
  • Tracing: Implementation of distributed tracing (Jaeger, Tempo, or Honeycomb) to visualise request flow across microservices.
  • Security Observability: Experience using Splunk for security orchestration (SOAR) or SIEM-related workflows.
  • Cloud Platforms: Experience managing native observability tools within AWS, Azure, or GCP.

The Okta Experience

  • Supporting Your Well-Being 
  • Driving Social Impact 
  • Developing Talent and Fostering Connection + Community

We are intentional about connection. Our global community, spanning over 20 offices worldwide, is united by a drive to innovate. Your journey begins with an immersive, in-person onboarding experience designed to accelerate your impact and connect you to our mission and team from day one.
Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws.
If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding, please use this form to request an accommodation.
Notice for New York City Applicants & Employees: Okta may use Automated Employment Decision Tools (AEDT), as defined by New York City Local Law 144, that use artificial intelligence, machine learning, or other automated processes to assist in our recruitment and hiring process. In accordance with NYC Local Law 144, if you are an applicant or employee residing in New York City, please click here to view our full NYC AEDT Notice.
Okta is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Personnel and Job Candidate Privacy Notice at https://www.okta.com/legal/personnel-policy/.

Skills Required

  • 8+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems
  • Strong coding skills in Go, Python, or Ruby for building internal tools and automating workflows
  • Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/EKS)
  • Hands-on experience with OpenTelemetry, Prometheus, or similar frameworks for instrumenting applications
  • A data-driven approach to debugging complex, cross-service performance bottlenecks

Okta Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Okta and has not been reviewed or approved by Okta.

  • Healthcare Strength: Health coverage spans medical, dental, vision, mental-health support, and income protection, complemented by preventive care options and wellness resources. These elements indicate robust coverage for both routine needs and more complex situations.
  • Parental & Family Support: Policies include paid parental leave, adoption and surrogacy assistance, and fertility and family‑building benefits. Caregiving resources and flexible arrangements help employees navigate family responsibilities.
  • Leave & Time Off Breadth: Flexible or unlimited PTO, separate sick time, paid holidays, and a company Wellbeing Week provide multiple avenues for time away. This breadth supports rest, recovery, and work‑life balance.

The Company
HQ: San Francisco, CA
6,000 Employees
Year Founded: 2009

What We Do

Okta is the leading independent identity provider. The Okta Identity Cloud enables organizations to securely connect the right people to the right technologies at the right time. With more than 7,000 pre-built integrations to applications and infrastructure providers, Okta provides simple and secure access to people and organizations everywhere, giving them the confidence to reach their full potential. More than 10,000 organizations, including JetBlue, Nordstrom, Siemens, Slack, T-Mobile, Takeda, Teach for America, and Twilio, trust Okta to help protect the identities of their workforces and customers.
