Manager of Site Reliability Engineering at H-E-B
Our Partners thrive The H-E-B Way. In the Manager of SRE job, that means you have a...
HEART FOR PEOPLE... you have a passion for mentorship and guidance, and a love for the direct person-to-person interactions that create strong bonds between teams
HEAD FOR BUSINESS... you have an ownership mentality and a consistent track record of timely delivery of high-quality software
PASSION FOR RESULTS... you have the ability to guide the discussion, remove roadblocks, and provide guardrails for your team as they identify challenges and propose solutions
What you'll do at H-E-B:
We enable the successful operation, secure modification, and agile creation of large-scale, fault-tolerant systems that delight our customers beyond expectation... which is easier said than done.
As an SRE Manager, your job is to join in that mission with a squad of other tenacious engineers to ensure the world-class performance, efficiency, change management, monitoring, capacity planning, and emergency response policies of our software, infrastructure, and their dependencies. Ultimately, your goal is to engineer operationally efficient and performant solutions, increase system observability, minimize human interaction with production systems, accelerate customer value delivery, and help evangelize those best practices to others.
We enable the reliability that makes building fantastic HEB Digital products possible -- and we are incredibly proud of that.
- Coach and mentor junior and senior engineers in engineering techniques, processes, and new technologies; enable others to succeed.
- Improve the observability pipeline and establish baseline capabilities for service level indicators.
- Engage in and improve the software delivery lifecycle from establishing acceptance criteria through deployment, operation, and refinement.
- Ensure success of new services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and readiness reviews.
- Engage and nurture other squads to be capable of maintaining services once they are live by measuring and monitoring availability, latency and overall system health.
- Scale systems sustainably through automation, agile improvement, and dynamic resource utilization.
- Practice sustainable incident management and foster a blameless retrospective culture.
Who We Are
- H-E-B is one of the largest independently owned food retailers in the nation, operating over 400 stores throughout Texas and Mexico, with annual sales of over $26 billion
- We hire talented people (116,000+ Partners) and give them autonomy to be creative in how they impact the business
- We're a Partner-driven company with a Bold Promise - Because People Matter
- We embrace Diversity and Inclusion as core values, and support them with thriving company-wide programs
- We're a truly original Texas-based company that created the Spirit of Giving to help Texas communities every day
- Once eligible, our Partners become Owners in the company. "Partner-owned" means our most important resources, People, drive the innovation, growth, and success that make H-E-B The Greatest Retailing Company