Senior Systems Operations Engineer - SRE and AIOps

Posted Yesterday
Hyderabad, Telangana, IND
Hybrid
Senior level
Fintech • Financial Services
Wells Fargo: Tech-powered. Innovation-led. We're transforming financial services.
The Role
About this role:
Wells Fargo is seeking a Senior Systems Operations Engineer within the Enterprise Functions Technology, Center of Excellence platform engineering team to deliver and support cloud workloads and services, provide engineering support and drive modernization of critical cloud capabilities.
In this role, you will:
  • Lead or participate in managing all installed systems and infrastructure within the Systems Operations functional area
  • Contribute to increasing system efficiencies and reducing human intervention time on related tasks
  • Review and analyze moderately complex operational support systems, application software, and system management tools to ensure the highest levels of systems and infrastructure availability
  • Work with vendors and other technical personnel for problem resolution
  • Lead the team to meet technical deliverables while leveraging a solid understanding of technical process controls or standards
  • Collaborate with vendors and other technical personnel to resolve technical issues and achieve highest levels of systems and infrastructure availability

Required Qualifications:
  • 4+ years of Systems Engineering, Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education

Desired Qualifications:
  • Set and evangelize the SRE and AIOps technical strategy for EFT, establishing reference architectures, standards, and guardrails (service tiering, onboarding criteria, SLO/error budget governance) and holding teams accountable through transparent executive-level reporting.
  • Own the reliability and observability architecture across hybrid/multi-cloud, driving standardization of monitoring, logging, tracing, synthetics, and resilience/chaos testing; define platform patterns that teams can adopt with minimal friction.
  • Design and implement AIOps and automation platforms (event correlation, anomaly detection, runbook automation, self-healing) with strong engineering discipline (testability, auditability, change safety) and prioritize initiatives that materially reduce incident volume, toil, and MTTR.
  • Define the reliability measurement system (SLIs/SLOs, error budgets, customer impact, MTTR/MTBF, change failure rate) and build reusable dashboards and alerts that drive consistent prioritization, investment decisions, and engineering behavior across teams.
  • Provide technical leadership during major incidents for critical services, driving rapid triage, clear stakeholder communications, and cross-domain coordination; institutionalize blameless post-incident reviews and engineering mechanisms that eliminate systemic causes.
  • Partner with application, platform, and architecture leaders to embed reliability into planning and delivery (design and architecture reviews, operational readiness gates, non-functional requirements, capacity/performance engineering), influencing roadmaps based on quantified risk and customer impact.
  • Lead multi-quarter, cross-organization reliability transformations (e.g., platform modernization, resilience programs, observability convergence), delivering reusable capabilities and operating mechanisms that improve reliability posture and reduce operational risk at scale.
  • Strong Java / backend service development experience
  • Distributed systems and API-based service design
  • CI/CD pipelines and Git-based workflows
  • 3+ years of experience with scripting and infrastructure automation using Terraform
  • 3+ years of hands-on experience with OpenShift, GCP or Azure platform enablement and application migrations, and with building out complex, programmable infrastructure patterns using Infrastructure as Code (IaC)
  • 2+ years of knowledge and understanding of Cloud service offerings such as data, analytics, AI/ML on GCP or Azure
  • 2+ years of experience with key services provided by Azure and/or GCP such as BigQuery, Vertex AI, DataProc, Functions, AKS, Service Fabric
  • 2+ years working in a globally distributed team to provide innovative and robust cloud-centric solutions.
  • 2+ years gathering and analyzing data to diagnose the root cause of cloud workload issues, recommending and implementing solutions to resolve issues in a timely manner.

Job Expectations:
  • Exposure to cloud governance and logging/monitoring tooling
  • Experience with Agile concepts and Site Reliability Engineering (SRE) Principles
  • Understanding, engineering and implementing disaster recovery and business continuity playbooks
  • Proficient in container-based solutions and services, with experience handling large-scale Kubernetes-based infrastructure build-out and provisioning on OpenShift, Azure or GCP
  • Knowledge and understanding of Cloud Service offerings on OpenShift, Azure or GCP related to security, data protection, and policy implementations
  • Ability to articulate technical solutions to both technical and business partners
  • Good understanding of networking, firewalls, and load balancing concepts (IP, DNS, guardrails, VNets) and exposure to databases, cloud security, Active Directory, authentication methods, and RBAC
  • SRE / Reliability
    • Production support mindset (incident response, on-call readiness)
    • Observability: logging, metrics, tracing (Splunk, AppDynamics, or similar)
    • Performance, availability, and reliability engineering concepts
    • Experience partnering with SRE or platform teams
  • Platform / Cloud
    • Kubernetes/OpenShift (deployments, troubleshooting, scaling)
    • Infrastructure-as-Code exposure (Terraform/Helm is a plus)

Posting End Date:
22 Jun 2026
*Job posting may come down early due to volume of applicants.
We Value Equal Opportunity
Wells Fargo is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other legally protected characteristic.
Employees support our focus on building strong customer relationships balanced with a strong risk mitigating and compliance-driven culture which firmly establishes those disciplines as critical to the success of our customers and company. They are accountable for execution of all applicable risk programs (Credit, Market, Financial Crimes, Operational, Regulatory Compliance), which includes effectively following and adhering to applicable Wells Fargo policies and procedures, appropriately fulfilling risk and compliance obligations, timely and effective escalation and remediation of issues, and making sound risk decisions. There is emphasis on proactive monitoring, governance, risk identification and escalation, as well as making sound risk decisions commensurate with the business unit's risk appetite and all risk and compliance program requirements.
Candidates applying to job openings posted in Canada: Applications for employment are encouraged from all qualified candidates, including women, persons with disabilities, aboriginal peoples and visible minorities. Accommodation for applicants with disabilities is available upon request in connection with the recruitment process.
Applicants with Disabilities
To request a medical accommodation during the application or interview process, visit Disability Inclusion at Wells Fargo.
Drug and Alcohol Policy
Wells Fargo maintains a drug free workplace. Please see our Drug and Alcohol Policy to learn more.
Wells Fargo Recruitment and Hiring Requirements:
a. Third-Party recordings are prohibited unless authorized by Wells Fargo.
b. Wells Fargo requires you to directly represent your own experiences during the recruiting and hiring process.

Skills Required

  • 4+ years of Systems Engineering or Technology Architecture experience (or equivalent)
  • Set and evangelize SRE and AIOps technical strategy, reference architectures, standards, and guardrails
  • Own reliability and observability architecture across hybrid/multi-cloud: monitoring, logging, tracing, synthetics, resilience/chaos testing
  • Design and implement AIOps and automation platforms: event correlation, anomaly detection, runbook automation, self-healing
  • Define SLIs/SLOs, error budgets, MTTR/MTBF and build dashboards/alerts for consistent prioritization
  • Provide technical leadership during major incidents and run blameless post-incident reviews
  • Strong Java and backend service development experience
  • Experience with distributed systems and API-based service design
  • Experience with CI/CD pipelines and Git-based workflows
  • 3+ years scripting and infrastructure automation using Terraform
  • 3+ years hands-on with OpenShift, GCP or Azure platform enablement and application migrations
  • 2+ years knowledge of cloud service offerings for data, analytics, AI/ML on GCP or Azure
  • 2+ years experience with BigQuery, Vertex AI, DataProc, Cloud Functions, AKS, Service Fabric
  • Experience working in globally distributed teams to deliver cloud-centric solutions
  • Experience gathering and analyzing data to diagnose cloud workload issues and implement solutions
  • Exposure to cloud governance, logging/monitoring tooling (observability)
  • Familiarity with Agile and Site Reliability Engineering (SRE) principles
  • Experience with disaster recovery and business continuity playbooks
  • Proficient with container-based solutions, large-scale Kubernetes/OpenShift infrastructure provisioning
  • Knowledge of cloud security, data protection, policy implementations, authentication methods
  • Good understanding of networking, firewalls, load balancing, IP/DNS, VNet concepts
  • Exposure to databases, Active Directory, and RBAC
  • Observability tooling experience: Splunk, AppDynamics or similar
  • Infrastructure-as-Code exposure; Terraform and Helm experience

The Company
HQ: San Francisco, CA
205,000 Employees
Year Founded: 1852

What We Do

Wells Fargo & Company (NYSE: WFC) is a leading financial services company that has approximately $2.1 trillion in assets. We provide a diversified set of banking, investment and mortgage products and services, as well as consumer and commercial finance, through our four reportable operating segments: Consumer Banking and Lending, Commercial Banking, Corporate and Investment Banking, and Wealth & Investment Management. Wells Fargo ranked No. 33 on Fortune’s 2025 rankings of America’s largest corporations. Our technology professionals drive innovation, information security, and big data analytics while maintaining a network that handles more than 12 billion customer interactions a year. Join us!
At Wells Fargo, we're more than a financial services leader – we’re a global trailblazer committed to driving innovation, empowering communities, and helping our customers succeed. We believe that a meaningful career is much more than just a job – it’s about finding all of the elements to help you thrive, in one place. Living the Well Life means you’re supported in life, not just work. It means having robust benefits, competitive compensation, and programs designed to help you find work-life balance and well-being. You’ll be rewarded for investing in your community, celebrated for being your authentic self, and empowered to grow. And we’re recognized for it – Wells Fargo once again ranked in the top three – making us the #1 financial services employer – on the 2025 LinkedIn Top Companies list of best workplaces “to grow your career” in the U.S.
© 2026 Wells Fargo Bank, N.A. All rights reserved. Member FDIC.

Why Work With Us

We're known for our “Well Life” approach to supporting employees’ career aspirations, work-life balance, and mental and physical health. We ranked in the top 3 on the 2025 LinkedIn Top Companies list – and #1 among financial services companies – as the best workplace “to grow your career” in the U.S.

Wells Fargo Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: 3 days a week
HQ: San Francisco, CA
Bangalore, Bangalore
Belfast, GB
Bengaluru, Karnataka
Chandler, AZ
Charlotte, NC
Technology Center
Hyderabad, Telangana
Irving, TX
New York, NY
Phoenix, AZ