Site Reliability Engineer Lead

Posted 3 Days Ago
Plano, TX, USA
In-Office
Senior level
Big Data • Fintech • Mobile • Payments • Financial Services • Data Privacy
The Role
Lead SRE partnering with development and infrastructure teams to implement monitoring, automation, reliability tooling, alerting, and on-call routines; develop reliability scripts and libraries; triage major incidents; reduce toil and improve observability; decompose work and mentor SRE resources.
Summary Generated by Built In

Job Description:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.
Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and making an impact in the communities we serve.
Bank of America is committed to an in-office culture with specific requirements for office-based attendance, while allowing an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.
At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

This job is responsible for partnering with engineering and technology teams to implement measures prescribed by the Site Reliability Engineer teams it leads. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting and on call routines are in place for key services, demonstrating technical expertise within domains, and decomposing objectives into work units. Job expectations include advancing efficient solution delivery practices and promoting exceptional design, engineering, and organizational practices.

The individual in this role is accountable for establishing and maintaining partnerships with Application Development and Production Support teams to implement the measures prescribed through the collaboration of the Senior Site Reliability Engineer (SRE) and the SRE team(s) they are leading. This includes ensuring the appropriate instrumentation, tooling, ticketing, alerting and on-call routines are in place for key services. This role demonstrates a high level of technical expertise within one or more technical domains, and the ability to decompose issues or objectives into units of work that can be assigned to other team members. This individual will advocate and advance more efficient solution delivery practices and evangelize great design, engineering and organizational practices.

Responsibilities:

  • Collaborates with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined in the application and system monitoring designs put forward by the Senior Site Reliability Engineer (SRE)

  • Develops and maintains reliability scripts, tools and libraries, leverages them for common instrumentation, automation, and operational needs, and uses them when mentoring SRE resources on reliability practices and established tools/capabilities

  • Partners to implement code changes to make use of common reliability libraries and tools and helps Application Production Services and Application Development teammates understand how to use them

  • Participates regularly in architecture community of practice meetings and communicates via other channels

  • Identifies vulnerabilities and opportunities for reliability improvement, such as investigating low level error rates and 'noise' in monitoring, and defines solutions to reduce manual support effort and/or improve system reliability

  • Engages as a subject matter expert in major incident triage, failure scenario modelling, and root cause diagnosis with the Problem Manager for major incident/problem management investigations

  • See the position summary and the required/desired qualifications

Required Qualifications:

  • 5+ years of experience in platform, systems, or infrastructure engineering, with a strong focus on automation and integration

  • Proficiency in SRE best practices; proven ability to reduce toil and improve observability of the environment

  • Experience with automation and orchestration tools (e.g., Ansible or similar), and scripting with Go, Python, or equivalent

  • Experience with supporting enterprise service mesh platforms

  • Experience with Infrastructure as Code (IaC) concepts and CI/CD pipelines supporting automated builds, validation, and deployments

  • Experience integrating provisioning workflows with platform services such as virtualization, networking, identity, monitoring, and configuration management systems

  • Strong focus on testing and reliability, including automated integration/validation testing and troubleshooting of complex workflows

Desired Qualifications:

  • Linux System Administration

  • Splunk Administration

  • OpenShift Containers

  • Dynatrace Administration

  • Grafana

  • Ansible Automation

  • Horizon CI/CD (Jenkins, XLR, Artifactory, BitBucket)

  • Azure/AWS/GCP Cloud

  • Fast learner

  • Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities

  • Systematic problem-solving approach, sense of ownership and drive

  • Ability to juggle competing priorities and adapt to changes in project scope

Skills:

  • Automation

  • Collaboration

  • Influence

  • Production Support

  • Result Orientation

  • Analytical Thinking

  • Application Development

  • Architecture

  • Solution Design

  • Stakeholder Management

  • Other

  • Terraform

Shift:

1st shift (United States of America)

Hours Per Week: 

40

Skills Required

  • 5+ years of experience in platform, systems, or infrastructure engineering with focus on automation and integration
  • Proficiency in SRE best practices and proven ability to reduce toil and improve observability
  • Experience with automation and orchestration tools (e.g., Ansible) and scripting with Go or Python
  • Experience supporting enterprise service mesh platforms
  • Experience with Infrastructure as Code (IaC) concepts and CI/CD pipelines for automated builds, validation, and deployments
  • Experience integrating provisioning workflows with platform services (virtualization, networking, identity, monitoring, configuration management)
  • Strong focus on testing and reliability, including automated integration/validation testing and troubleshooting complex workflows
  • Terraform
  • Linux System Administration
  • Splunk Administration
  • OpenShift Containers
  • Dynatrace Administration
  • Grafana
  • Ansible Automation (listed also as desired)
  • Horizon CI/CD tools (Jenkins, XLR, Artifactory, BitBucket)
  • Azure, AWS or GCP cloud experience
  • Proven ability to work independently and as part of a team; strong ownership and problem-solving

Bank of America Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Bank of America and has not been reviewed or approved by Bank of America.

  • Fair & Transparent Compensation: The $25/hour U.S. minimum wage, reaffirmed in recent company materials, sets a clear compensation floor that lifts entry-level and operations pay. Public salary information and disclosures provide visible benchmarks for pay across roles.
  • Parental & Family Support: Parental leave extends up to 26 weeks with 16 weeks fully paid for eligible teammates, alongside backup child and adult care and a dedicated Life Event Services team. Family-building assistance offers up to a $20,000 lifetime reimbursement, and bereavement leave provides 20 paid days for loss of a spouse, partner, or child.
  • Retirement Support: Retirement programs include a 401(k) match up to 5% of eligible pay plus an additional 2–3% annual company contribution based on service. These employer contributions add meaningful long-term value beyond base pay.

The Company
HQ: Charlotte, NC
208,000 Employees
Year Founded: 1784

What We Do

We make financial lives better for our clients and our communities through the power of every connection. Our employees are at the heart of this purpose, and are key to driving responsible growth. Every day, across the globe, our employees bring a commitment to our purpose and to driving responsible growth by living our values: deliver together, act responsibly, realize the power of our people and trust the team. A key aspect of driving responsible growth is doing so in a sustainable manner, a critical pillar of which is being a great place to work for our teammates.
