Senior Site Reliability Engineer, Observability and Monitoring Team

Posted 21 Days Ago
Bellevue, WA
In-Office
159K-239K Annually
Senior level
Cloud
The Role
As a Senior Site Reliability Engineer, you will lead observability initiatives, manage stakeholders, automate observability data processes, and support a 24x7 online environment.

Get to know Okta
Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth.
At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box - we’re looking for lifelong learners and people who can make us better with their unique experiences. 
Join our team! We’re building a world where Identity belongs to you.

We're searching for a Senior Site Reliability Engineer (SRE) with a profound passion for observability to join our team. This isn't just a hands-on role; you'll be a thought leader, shaping the strategy and execution of our observability services—logs, metrics, and tracing—both within the Observability team and across the broader organization. We're looking for someone who can help us see clearly when things get cloudy!

Your expertise in Kubernetes will be crucial as we undergo a significant replatforming initiative. You will guide the design, implementation, and operation of our advanced observability capabilities on the new platform.

A cornerstone of this role is your exceptional ability to manage and influence stakeholders, ensuring their needs are met, expectations are managed, and they're delighted with the insights our observability services provide. We believe that our important stakeholders deserve metric-ulous attention.

What You'll Be Doing
  • Becoming deeply familiar with all corners of a critical SaaS platform utilized by millions of customers daily, with an eye towards providing unparalleled observability insights into its behavior and performance.
  • Engaging with stakeholders across the group to not only understand their component boundaries and dependencies but also to drive the adoption of observability best practices as a guide and coach for your teammates and the wider engineering organization. 
  • Championing the evolution of our SDLC: defining how we ideate, onboard, operate, and scale microservices and features in a secure, performant, always-on manner, with observability (logs, metrics, tracing) as a foundational element from inception.
  • Identifying, understanding, and automating away manual processes through clever code and smart architecture, particularly focusing on how automation can enhance the collection, analysis, and actionability of observability data. 
  • Supporting a 24x7 online environment as part of a global on-call rotation, leveraging your deep observability expertise to rapidly identify, diagnose, and resolve the most complex incidents. 
  • Advocating for and establishing best practices for scalable, reliable, and resilient systems and services across all of WIC engineering, with a strong emphasis on fostering an observability-driven culture.
What You'll Bring to the Role
  • 4+ years of experience as a site reliability or platform engineer, preferably in a fast-scaling environment, with a significant and demonstrable track record in leading observability initiatives.
  • 2+ years of experience designing, scaling, and operating observability solutions for applications within a Kubernetes environment. You’ll be adept at leveraging Kubernetes capabilities to gain insights into workload performance and health.
  • Familiarity with large-scale containerized deployments, both microservice and monolithic, coupled with a deep understanding of their unique observability challenges and solutions.
  • A proactive and tenacious mindset: always willing to go the extra mile to identify a problem and drive its resolution, especially when it pertains to improving system visibility and reliability.
  • A strong passion for mentoring and encouraging the development of engineering peers, leading by example in adopting and promoting robust observability practices.
  • Deep knowledge of CI/CD principles, Linux fundamentals, OS hardening, networking concepts, and Internet protocols, applied strategically to build resilient and observable systems.
  • Strong skills in multiple operational tooling languages such as Python, Rust, or Go, for automating sophisticated observability tasks and integrations.
  • Proven ability to effectively manage and influence diverse stakeholders, translating complex technical observability concepts into clear, actionable insights, and ensuring high levels of satisfaction with observability services.
  • Expert proficiency with Splunk or a similar platform for large-scale log management and advanced analysis.
  • Extensive experience with Grafana for designing and implementing sophisticated dashboards and visualizations of critical metrics.


The annual base salary range for this position for candidates located in the San Francisco Bay area is between:
$159,000 - $239,000 USD

Below is the annual base salary range for candidates located in California, Colorado, New York and Washington. Your actual base salary will depend on factors such as your skills, qualifications, experience, and work location. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies. To learn more about our Total Rewards program please visit: https://rewards.okta.com/us.   

The annual base salary range for this position for candidates located in California (excluding San Francisco Bay Area), Colorado, New York, and Washington is between:
$142,000 - $214,000 USD

What you can look forward to as a Full-Time Okta employee!

  • Amazing Benefits
  • Making Social Impact
  • Developing Talent and Fostering Connection + Community at Okta

Okta cultivates a dynamic work environment, providing the best tools, technology and benefits to empower our employees to work productively in a setting that best and uniquely suits their needs. Each organization is unique in the degree of flexibility and mobility in which they work so that all employees are enabled to be their most creative and successful versions of themselves, regardless of where they live. Find your place at Okta today! https://www.okta.com/company/careers/.
Some roles may require travel to one of our office locations for in-person onboarding.

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws.
If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding please use this Form to request an accommodation.

Okta is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Privacy Policy at https://www.okta.com/privacy-policy/. 

Top Skills

Go
Grafana
Kubernetes
Python
Rust
Splunk

The Company
HQ: San Francisco, CA
6,000 Employees
Year Founded: 2009

What We Do

Okta is the leading independent identity provider. The Okta Identity Cloud enables organizations to securely connect the right people to the right technologies at the right time. With more than 7,000 pre-built integrations to applications and infrastructure providers, Okta provides simple and secure access to people and organizations everywhere, giving them the confidence to reach their full potential. More than 10,000 organizations, including JetBlue, Nordstrom, Siemens, Slack, T-Mobile, Takeda, Teach for America, and Twilio, trust Okta to help protect the identities of their workforces and customers.
