Site Reliability Engineering Manager, Monitoring at Okta
We are looking for an experienced Site Reliability Engineering Manager to join Okta’s Technical Operations Team. At Okta our motto is "Always On", and nowhere do we embrace that more than in Technical Operations. We strive to build the most reliable and performant systems on the planet through the skillful use of automation. We've created an integrated system that securely connects any person via any device to the technologies they need to do their most significant work.
This SRE Manager role is ideal for someone who loves monitoring and automation technology and enjoys seeing their team grow and succeed. The monitoring team at Okta is instrumental in providing visibility into Okta's large-scale production environment and in helping our customers achieve high availability.
The ideal candidate:
- Has a track record of leading or managing high-performing teams whilst still being hands-on.
- Has production experience with AWS cloud-based infrastructure.
- Has operated complex custom applications on UNIX/Linux and/or Enterprise Java platforms
- Is passionate about monitoring, observability and actionable alerts at scale.
- Is a champion for automation and leveraging agile software development methodologies
- Has in-depth knowledge of industry standard commercial or open source monitoring tools
Job Duties and Responsibilities:
- Mentor and manage a team of experienced engineers using agile development
- Manage and own delivery of monitoring components:
- Collaborate with TPM, architects, and executive management
- Design and code reviews
- Partner with Okta security teams.
- Partner with recruiting to hire staff
- Continuously refine monitoring processes, thresholds, and configuration
- Respond to issues and escalations and participate in a management on-call rotation
- Work closely with our monitoring tool vendors to drive improvement and economies
Minimum REQUIRED Knowledge, Skills, and Abilities:
- Demonstrated track record of leading or managing a team
- Experience with Amazon Web Services and knowledge of building and configuring AWS services
- Experience managing Linux systems in production
- Proficient in at least one scripting language (Bash, Perl, Ruby, Python)
- Production use of one of the following tools: Splunk, Wavefront, Zabbix, ELK, Prometheus, or Grafana
- Prior experience in software development, DevOps role, or SRE role
- Builds Effective Teams: Building strong-identity teams that apply their diverse skills and perspectives to achieve common goals.
- Demonstrates Self-Awareness (EQ): Using a combination of feedback and reflection to gain productive insight into personal strengths and weaknesses.
- Develops Talent: Developing people to meet both their career goals and the organization’s goals.
- Drives Results: Consistently achieving results, even under tough circumstances.
- Strategic Mindset: Seeing ahead to future possibilities and translating them into breakthrough strategies.
Okta is an Equal Opportunity Employer.
Okta is rethinking the traditional work environment, providing our employees with the flexibility to be their most creative and successful versions of themselves, no matter where they are located. We enable a flexible approach to work, meaning for roles where it makes sense, you can work from the office, or from home, regardless of where you live. Okta invests in the best technologies and provides flexible benefits and collaborative work environments/experiences, empowering employees to work productively in a setting that best and uniquely suits their needs. Find your place at Okta https://www.okta.com/company/careers/.
By submitting an application, you agree to the retention of your personal data for consideration for a future position at Okta. More details about Okta’s privacy practices can be found at: https://www.okta.com/privacy-policy.