Job Description:
Principal Site Reliability Engineer
Our Site Reliability Engineering group within Enterprise Infrastructure combines Operations Excellence with the Development Experience to deliver services at high scale and high availability, with resilience, by using automation and Infrastructure as Code. We build reliability into our ecosystem by applying best practices in Resiliency Engineering, Automation, Observability & Chaos Testing.
The team comes from diverse technical backgrounds, and the responsibilities offer a variety of challenges. Ideal candidates will have a background in either software engineering or systems engineering, with a desire to learn the other, or previous experience as an SRE. We are looking for a systems-thinking Principal SRE who has helped teams scale through production insights, operational automation, developer guidance, real-time metrics, automation, automation, automation...!
The Expertise We’re Looking For
- Bachelor’s degree or higher in a technology-related field (e.g. Engineering, Computer Science) required; master’s degree a plus.
- 8+ years of hands-on experience deploying and/or supporting highly distributed multi-tiered systems at scale.
- Hands-on experience with Public Cloud, preferably AWS or Azure.
- Hands-on experience with EKS, AKS, or Rancher Kubernetes Service container orchestration.
- Experience operating and implementing distributed & highly concurrent service-based architectures, including microservices, containerized services, and/or serverless architectures.
- Thought leadership and an ability to handle production incidents.
The Skills You Bring
- Hands-on Kubernetes skills and knowledge.
- Programming/development track record with a compiled/OOP-geared language like C# or Java, and experience with a scripting/interpreted language like JavaScript/TypeScript or Python.
- Proven experience in maintaining the scalability and resiliency of complex environments.
- Demonstrated ability to utilize modern monitoring tools (Datadog, Prometheus, Splunk, …)
- Experience in instrumentation, with systems skills in building and operating monitoring, logging, and alerting services for distributed systems at scale.
- Understand, implement, and be accountable for the Production Services/SRE capabilities across Digital Security. This includes direct knowledge of the capabilities, their usage and value, and their gaps and challenges.
- Provide technical and operational leadership and serve as an escalation point of contact during major incidents or issues that are not resolved in the expected timeframes. Take hands-on responsibility for actively leading Production bridges during major incidents, working across the team and the Enterprise Infrastructure organization.
- Take responsibility for the execution and quality control of the Fidelity Brokerage Business Unit-specific Post-Mortem reviews for the team, including deep technical RCA and Observability & Automation reviews, and act as the connection across Enterprise Infrastructure domains in the region.
The Value You Deliver
- Help define and execute a comprehensive reliability and observability strategy, ensuring that Fidelity’s systems are always available when our customers need them.
- Bring together technical, procedural, and financial data to reduce toil and increase efficiency.
- Execute plans for technical standardization and process refinement within the engineering organization, especially for Site Reliability Engineers.
- Troubleshoot stack-wide engineering issues related to hardware, software, network, applications and cloud service providers.
- Coach peer SREs and development teams on how to build highly available systems.
Category: Information Technology
What We Do
At Fidelity, our goal is to make financial expertise broadly accessible and effective in helping people live the lives they want. We do this by focusing on a diverse set of customers: from 23 million people investing their life savings, to 20,000 businesses managing their employee benefits, to 10,000 advisors needing innovative technology to invest their clients’ money. We offer investment management, retirement planning, portfolio guidance, brokerage, and many other financial products.
Privately held for nearly 70 years, we’ve always believed that by providing investors with access to information and expertise, we can help them achieve better results. That’s been our approach: innovative yet personal, compassionate yet responsible, grounded by a tireless work ethic. It is the heart of the Fidelity way.