Production Support SRE Manager
Plano 1 (31061), United States of America, Plano, Texas
We are looking for an experienced Production Support SRE Manager with an operational or site reliability engineering background and a passion for providing superior system availability and customer experience. We are looking for candidates who can lead a 24/7 support organization and drive reliability and performance at massive scale by mastering the full depth of the stack. As a Manager, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while using your expertise in the delivery and support of critical services.
This position is responsible for providing operational support for the Capital One Data ecosystem by managing the TOC Data Pod. The TOC Data Pod provides 24/7 service and rapid response for all of Capital One data transformation, including the Enterprise Data & Machine Learning and Financial Services Data platforms. The team is focused on driving production engineering innovations for our customers to continuously improve our production engineering processes.
Responsibilities:
- Increase operational efficiencies to proactively reduce and mitigate production incidents
- Provide Call Leadership to mitigate critical incidents
- Lead a team of experienced support engineers to meet or exceed expectations on incident SLAs
- Lead a high-performing team of support engineers across several geographical locations to provide 24x7 support for systems, with an ever-watchful eye on their availability, latency, performance, and capacity
- Collaborate with other tech leads and support teams to ensure integrated end-to-end availability, reliability, and performance
- Define support strategies for systems in the Cloud
- Influence resiliency and scalability in production environments in Amazon Web Services and other cloud platforms
- Identify and drive resolution on monitoring and alerting gaps
- Lead a team to design, write, and deliver technical and process automation to improve the availability, scalability, latency, and efficiency of Capital One's services
- Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automated response to all non-exceptional service conditions
- Engage in service capacity planning and demand forecasting, software performance analysis, and system tuning
- Identify and remediate risk to critical and non-critical system KPIs
Basic Qualifications:
- Bachelor's Degree or military experience
- At least 3 years of experience in managing production support teams
- At least 1 year of experience in cloud services configuration and administration
- At least 1 year of experience in RESTful web and API services support and deployment
- At least 2 years of people management experience
Preferred Qualifications:
- 2+ years of experience in cloud services configuration and administration
- 1+ year of experience with scripting language(s) such as Python to debug and optimize code and automate routine tasks
- 1+ year of experience with Splunk, Datadog, or New Relic monitoring and alerting
- Current Associate Level cloud certification (AWS Solutions Architect, Developer, or SysOps)
At this time, Capital One will not sponsor a new applicant for employment authorization for this position.
No agencies please. Capital One is an Equal Opportunity Employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to sex, race, color, age, national origin, religion, physical and mental disability, genetic information, marital status, sexual orientation, gender identity/assignment, citizenship, pregnancy or maternity, protected veteran status, or any other status prohibited by applicable national, federal, state or local law. Capital One promotes a drug-free workplace. Capital One will consider for employment qualified applicants with a criminal history in a manner consistent with the requirements of applicable laws regarding criminal background inquiries, including, to the extent applicable, Article 23-A of the New York Correction Law; San Francisco, California Police Code Article 49, Sections 4901-4920; New York City's Fair Chance Act; Philadelphia's Fair Criminal Records Screening Act; and other applicable federal, state, and local laws and regulations regarding criminal background inquiries.
If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation, please contact Capital One Recruiting at 1-800-304-9102 or via email at [email protected]. All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodations.
For technical support or questions about Capital One's recruiting process, please send an email to [email protected]
Capital One does not provide, endorse nor guarantee and is not liable for third-party products, services, educational tools or other information available through this site.
Capital One Financial is made up of several different entities. Please note that any position posted in Canada is for Capital One Canada, any position posted in the United Kingdom is for Capital One Europe and any position posted in the Philippines is for Capital One Philippines Service Corp. (COPSSC).