Senior Reliability Engineer (Data Infrastructure)

Posted 5 Days Ago
Lisbon
Hybrid
Senior level
Fintech • Information Technology • Security
The Role
As a Senior Site Reliability Engineer, you'll enhance the reliability of data infrastructure across AWS and GCP, design service architectures, respond to incidents, manage cloud infrastructure, optimize performance, and collaborate with cross-functional teams.

What you will be doing:

We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our Data Infrastructure team. You will be responsible for ensuring the reliability, availability, and performance of our critical data systems running on AWS and GCP. Your expertise in cloud infrastructure, automation, and operational excellence will be crucial in supporting our product throughout our global client base.

As a Senior Site Reliability Engineer, you will:

  • Design, implement, and maintain highly available and reliable data infrastructure services, including SQL, NoSQL, Kafka, and Spark-based data layers. Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • Participate in an on-call rotation to respond to incidents and ensure rapid resolution of production issues. Conduct thorough post-incident reviews to identify root causes and implement preventative measures.
  • Manage and automate cloud infrastructure using Terraform and Helm, adhering to GitOps principles.
  • Implement and maintain comprehensive monitoring, logging, and tracing solutions to proactively identify and resolve performance and reliability issues.
  • Monitor and manage data infrastructure capacity, plan for future growth, and optimize performance through tuning and automation.
  • Develop and maintain automation scripts and tools to streamline operational tasks, improve efficiency, and reduce manual effort.
  • Ensure the security and compliance of data infrastructure services, implementing best practices for access control, data protection, and vulnerability management.
  • Collaborate with development and data engineering teams to ensure smooth deployments and operational support. Maintain thorough documentation of infrastructure configurations, processes, and procedures.
  • Manage and maintain distributed databases running within a Kubernetes environment.

Our Tech Stack:

  • Cloud-Based Infrastructure: Fully cloud-based with a Kubernetes-focused tech stack. Compute workloads run in Kubernetes clusters across multiple regions.
  • Infrastructure Management: Heavy use of Terraform and Helm, adhering to GitOps paradigms for managing cloud infrastructure and Kubernetes applications.
  • Core Technologies: Extensive use of Kafka, distributed PostgreSQL and Cassandra, Elasticsearch, and Databricks/Spark. Development of inter-cloud failover options to support multi-cloud plans.
  • Wide Array of Applications: Teams build and release containerised applications for low latency APIs, machine learning models, and data processing pipelines.

About You:

  • Experience as an SRE managing cloud infrastructure (AWS and/or GCP) and data systems (Apache Kafka, Apache Spark, Elasticsearch, PostgreSQL, Cassandra). Proven track record of improving reliability and availability in complex production environments.
  • Extensive experience codifying infrastructure using Terraform and Helm charts.
  • Proven experience managing and troubleshooting distributed databases within Kubernetes.
  • Deep understanding of monitoring, logging, and tracing tools and techniques.
  • Strong incident response and troubleshooting skills.
  • Proficiency in scripting and automation tools.
  • Understanding of security best practices for cloud infrastructure and data systems.
  • Familiarity with CI tooling, test pipelines, and asset generation (e.g., Docker images, Helm charts).

Education:

  • BSc/BA degree in computer science, engineering, or related discipline OR equivalent experience in required skills.

Nice to have:

  • Familiarity with distributed SQL and NoSQL databases such as Yugabyte, Cockroach, Spanner, HBase, or CouchDB.
  • Familiarity with data modelling, sharding, and indexing strategies for large-scale databases.

What’s in it for you? 

  • Equity, as we want you to own a part of what we are building
  • Private medical insurance, ensuring peace of mind while you excel in your career
  • Unlimited time off policy: work-life balance and a focus on well-being are critical to keeping us performing at our best
  • We embrace a hybrid approach that requires employees to be in the office for two days a week. We strongly believe that this approach fosters collaboration and enables the building of meaningful relationships
  • You will also get a new starter budget to kit out your home office 
  • Opportunity to work on innovative projects with smart-minded people keen to share their knowledge and continuously improve 
  • Annual learning budget (prorated based on start date) to drive your performance and career development 

About us:

Our mission is to empower every business to eliminate financial crime. 

By harnessing AI, a unified platform, and an extensive partner ecosystem, we help customers turn compliance into a catalyst for growth, operational resilience, and enduring regulatory trust.

More than 3,000 enterprises across 75 countries rely on our end-to-end platform and the world’s most comprehensive financial crime risk intelligence. With full-stack agentic automation, we help organizations automate up to 95% of KYC, AML, and sanctions reviews, cut onboarding times by 50%, reduce false positives by 70%, and handle 7x more work with the same staff.

ComplyAdvantage is headquartered in London and has global hubs in New York, Lisbon, Singapore, and Cluj-Napoca. It is backed by Balderton Capital, Index Ventures, Ontario Teachers’ Pension Plan, Goldman Sachs, and Andreessen Horowitz. Learn more about compliance re-engineered for the age of AI at complyadvantage.com.

Top Skills

AWS
Cassandra
Databricks
Elasticsearch
GCP
Helm
Kafka
Kubernetes
NoSQL
Postgres
Spark
SQL
Terraform
The Company
HQ: London
483 Employees
Year Founded: 2014

What We Do

At ComplyAdvantage, we believe that compliance doesn’t have to be painful. Businesses need real-time financial crime insight to put them in control.

We enable you to understand the real risk of who you're doing business with, through the world's only global, real-time risk database of people and companies. We actively identify tens of thousands of risk events from millions of structured and unstructured data points - every single day.

Our suite of configurable cloud services integrates seamlessly to help automate and reduce the frustration of complying with Sanctions, AML and CTF regulations.
