Senior Site Reliability Engineer

Reposted 3 Days Ago
2 Locations
Remote
Senior level
Gaming • Hardware
The Role
The Senior Site Reliability Engineer will design and maintain Infrastructure as Code solutions, enhance cloud infrastructure, lead incident responses, and mentor junior engineers.
Summary Generated by Built In

Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make a global impact while working with a team spread across 5 continents. Razer is also a great place to work, providing the unique, gamer-centric #LifeAtRazer experience that will accelerate your growth, both personally and professionally.

Job Responsibilities:

We are seeking a skilled and driven Senior Site Reliability Engineer (SRE) to join our growing infrastructure and platform engineering team. The ideal candidate will have hands-on experience with Amazon Web Services (AWS), strong troubleshooting capabilities, and a passion for building scalable, observable, and resilient systems using modern Infrastructure as Code (IaC) and automation tools.

REQUIREMENTS:

  • Bachelor’s degree in Computer Science, Software Engineering, Information Technology, or a related field.
  • Minimum 3 years of experience in SRE, DevOps, cloud infrastructure, or system administration roles.
  • Hands-on expertise with AWS Cloud Services, including:
      • Compute & Containerization: EC2, Lambda, ECS, EKS, Auto Scaling
      • Networking: Load Balancers, VPC, Route 53, Security Groups, Firewalls
      • Storage & Databases: RDS, ElastiCache, Athena, S3
      • Messaging: SQS, SES
  • Deep understanding of Infrastructure as Code (IaC) tools such as Terraform and CloudFormation.
  • Proficiency in at least one programming/scripting language: Python, Node.js, Bash, Ruby, or related.
  • Experience operating and troubleshooting across Linux, Windows, and container-based environments.
  • Strong understanding of distributed systems, cloud networking (routers, switches), firewalls, DNS, and HTTP/TLS.
  • Experience implementing monitoring and alerting systems and working with incident management processes.
  • Experience with zero-downtime deployments, such as blue/green or canary deployments.
  • Familiarity with cost optimization and right-sizing AWS resources.
  • Exposure to multi-region, multi-account AWS architecture.
  • Understanding of API gateways or edge networking (e.g., Akamai, CloudFront).

JOB DESCRIPTION:

  • Design, implement, and maintain Infrastructure as Code (IaC) solutions using Terraform and/or CloudFormation across multi-account AWS environments.
  • Collaborate with developers, architects, and DevOps teams to build scalable, secure, and observable cloud infrastructure.
  • Lead and participate in architecture design sessions, focusing on system reliability, scalability, security, and performance.
  • Implement and manage robust monitoring, alerting, and observability solutions (e.g., CloudWatch, Prometheus, ELK, Datadog).
  • Set and monitor Key Performance Indicators (KPIs) for system uptime, latency, throughput, and overall reliability.
  • Drive incident response processes, including coordination, triaging, resolution, documentation, and post-incident reviews (PIRs).
  • Supervise and mentor junior SREs and infrastructure engineers, fostering knowledge-sharing and team growth.
  • Collaborate across development, operations, and security teams to ensure secure and compliant deployments.
  • Automate manual tasks and workflows through scripting and tooling (Python, Node.js, Bash, Ruby, JSON/YAML).
  • Troubleshoot complex infrastructure issues across Linux, Windows, Docker, and cloud-native environments.
  • Provide IaC and CI/CD best practices to ensure repeatability, scalability, and compliance across all environments.
  • Provide on-call support, participate in incident rotations, and lead technical investigations during outages or degradations.
  • Demonstrate a strong understanding of, and experience with, Disaster Recovery (DR).
  • Provide support and resolution for assigned incidents and tickets.

Pre-Requisites:

Are you game?

Top Skills

Amazon Web Services (AWS)
Bash
CloudFormation
CloudWatch
Datadog
Docker
ELK
Linux
Node.js
Python
Ruby
Terraform
Windows

The Company
1,383 Employees
Year Founded: 2005

What We Do

Razer™ is the world’s leading lifestyle brand for gamers.

The triple-headed snake trademark of Razer is one of the most recognized logos in the global gaming and esports communities.

With a fan base that spans every continent, the company has designed and built the world’s largest gamer-focused ecosystem of hardware, software and services.

Razer’s award-winning hardware includes high-performance gaming peripherals and Blade gaming laptops. Razer’s software platform, with over 70 million users, includes Razer Synapse (an Internet of Things platform), Razer Chroma™ (a proprietary RGB lighting technology system), and Razer Cortex (a game optimizer and launcher).

In services, Razer Gold is one of the world’s largest virtual credit services for gamers, and Razer Fintech is one of the largest online-to-offline digital payment networks in SE Asia.

Founded in 2005 and dual-headquartered in Irvine and Singapore, Razer has 18 offices worldwide and is recognized as the leading brand for gamers in the USA, Europe and China. Razer is listed on the Hong Kong Stock Exchange (Stock Code: 1337).

