Senior Site Reliability Engineer

London, Greater London, England, GBR
In-Office
Senior level
Fintech • Software
The Role
As a Senior Site Reliability Engineer, you will maintain production resilience, implement observability technologies, lead incident response, and develop automation for operational improvement.

SS&C, a leading financial services and healthcare technology company by revenue, is headquartered in Windsor, Connecticut, and has 27,000+ employees in 35 countries. Some 20,000 financial services and healthcare organizations, from the world's largest companies to small and mid-market firms, rely on SS&C for expertise, scale, and technology.

Job Description

SS&C is a global financial technology and software-enabled services company that provides mission-critical solutions primarily to the financial services and healthcare industries. It is headquartered in Windsor, Connecticut, USA, and is publicly listed on the NASDAQ. SS&C is widely regarded as one of the largest administrators of hedge fund and private equity operations and the largest mutual fund transfer agency globally.
 

About the role:

Operated within the SS&C WIT business, Genesis is an all-new investment operations platform that provides extensive asset class and functional support across the front, middle, and back office. Built natively for the cloud with advanced technology, Genesis features an innovative user experience, actionable monitors, notifications, and alerts infused with AI.
The role requires in-depth knowledge of observability principles and strong experience implementing an observability stack across the infrastructure, data and application layers of real-time, compute-intensive, distributed environments. The Senior SRE will have a solid understanding of cloud platforms and container orchestration, a comprehensive grasp of incident management and operational risk mitigation, and experience implementing automation frameworks to minimize toil and reduce MTTD/MTTR. They will have proven experience using infrastructure as code and familiarity with AI-driven operational tooling. They will be a logical thinker with strong problem-solving and communication skills and a desire to drive continuous improvement.

Your Responsibilities:

  • Maintain shared ownership of production-level resilience and reliability for business-critical systems.
  • Leverage industry-standard observability technologies to provide a centralized view of system and service health.
  • Implement and continually improve monitoring and alerting based on harvested logs, metrics and traces.
  • Lead incident response, post-incident reviews and post-remediation improvements.
  • Define and establish KPIs, SLIs and SLOs in support of agreed service levels.
  • Develop and maintain automation, and leverage generative AI technologies, to reduce operational toil and improve MTTD and MTTR.
  • Take on support for additional technical service components as the service evolves.
  • Support, mentor and train SREs.
  • Work with other teams to maintain a sound knowledge of all aspects of the application technical architecture.
  • Contribute to building up and maintaining a knowledge base in support of the technical role.
  • Maintain an awareness of, comply with and champion the stated service controls required to achieve audit compliance.

Your Experience:

  • Bachelor’s degree in Computer Science, Software Engineering, or a related field.
  • ITIL foundation level or experience working in an ITIL framework preferred.
  • 4+ years of Linux OS and Windows OS systems management experience.
  • 4+ years of experience with observability technologies for system monitoring and alerting (e.g. Prometheus, Grafana, Loki).
  • 2+ years working in a team environment with operational responsibilities for client-facing applications.
  • 2+ years of experience with containerization technologies and Kubernetes.
  • Proven scripting skills in at least one of shell scripting (csh, ksh, Bash or Windows PowerShell), Ansible, Terraform or Python.
  • Working experience with workload automation / enterprise scheduling tools such as Airflow.
  • Working experience with, and a technical understanding of, NoSQL databases such as MongoDB/Cassandra and traditional relational databases such as SQL Server/Oracle/PostgreSQL.
  • Working experience in a cloud self-service environment.
  • Working experience using LLMs or AI in monitoring and observability stacks.
     

EEO Statement / Non-agency Disclosure

We encourage applications from people of all backgrounds and particularly welcome applications from under-represented groups, to enable us to bring a diversity of perspectives to our thinking and conversation. It's important to us that we strive to have a workforce that is diverse in the widest sense.

Unless explicitly requested or approached by SS&C Technologies, Inc. or any of its affiliated companies, the company will not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.

SS&C Technologies is an Equal Employment Opportunity employer and does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.

Top Skills

AI
Airflow
Ansible
Cassandra
Grafana
Kubernetes
Linux
Loki
MongoDB
Oracle
Prometheus
Python
SQL Server
Terraform
Windows
The Company
HQ: Windsor, CT
22,000 Employees
Year Founded: 1986

What We Do

SS&C is a global provider of investment and financial services and software for the financial services and healthcare industries. Named to the Fortune 1000 list as a top U.S. company based on revenue, SS&C is headquartered in Windsor, Connecticut, and has 22,000+ employees in over 150 offices in 35 countries. Some 18,000 financial services and healthcare organizations, from the world's largest institutions to local firms, manage and account for their investments using SS&C's products and services.
