Staff Site Reliability Engineer

Posted 23 Days Ago
San Diego, CA, USA
In-Office
199K-299K Annually
Expert/Leader
Gaming
The Role
The Staff Site Reliability Engineer manages the availability and resilience of the monetization platform, driving automation, process improvements, and collaboration across teams. Responsibilities include application management, operational excellence, monitoring, incident resolution, and performance analysis in an AWS environment.
Summary Generated by Built In

Why Sony Interactive Entertainment?

Sony Interactive Entertainment isn’t just the Best Place to Play — it’s also the Best Place to Work. Sony Interactive Entertainment (SIE) is the company behind the PlayStation brand. As a subsidiary of Sony Group Corporation, we’re part of a proud legacy of innovation and excellence. SIE is a dynamic technology company, delivering cutting-edge hardware and network services to more than 100 million people and an entertainment leader, home to some of the most beloved and recognizable intellectual properties (IP) in the world. Our role at SIE is to create and nurture the experiences under the PlayStation brand, a name synonymous with entertainment excellence and creativity.

As a key leader of the Commerce - Technical Operations team, you will help drive availability and enablement for the PlayStation Store, Catalog, Entitlement, Pricing, and Device Management platforms. You will partner closely with product engineering teams to deliver innovative player features and elevate operational excellence for millions of players worldwide.

Responsibilities:

Your responsibilities will include hands-on application management within a public cloud environment, ensuring availability, resiliency, scalability and performance. You will work side by side with our service development teams to develop, automate and ensure the production readiness of all new services and features introduced.

  • Review and influence service architecture and system design to improve resiliency, fault tolerance, and scalability. Establish and promote best practices across engineering teams.
  • Own and evolve Infrastructure-as-Code for managed services (AWS, GCP). Design and build scalable, reusable modules and automation that standardize provisioning, configuration, and operations.
  • Develop and improve production-grade automation and tooling that deliver measurable gains in operational processes, such as reduced manual toil and lower MTTx.
  • Increase observability across the platform by implementing robust monitoring, logging, and tracing patterns. Build actionable dashboards and define meaningful alerting strategies that reduce MTTD and MTTR while minimizing noise.
  • Leverage service telemetry and historical data to anticipate capacity needs, detect anomalous behavior, and proactively prevent incidents. Develop data-informed approaches to performance optimization and reliability engineering.
  • Lead performance and capacity planning initiatives. Apply cloud-native patterns (e.g., auto-scaling, spot capacity, container orchestration with EKS) to optimize cost, performance, and availability at scale.
  • Contribute code to shared repositories and platform components improving reliability, scalability, and maintainability.
  • Collaborate across SIE with a variety of engineering, product, security and PMO teams to drive reliability improvements and ensure consistent operational standards across PlayStation services.
  • Contribute to and help evolve reliability engineering practices, including SLIs/SLOs, error budgets, and operational readiness standards.
  • Provide rotational on-call support, including incident detection, triage, and resolution for production systems, with a focus on continuous improvement of system reliability.
  • Lead post-incident reviews, producing clear root cause analyses and driving follow-through on corrective and preventative actions across teams.

Qualifications:

  • BS degree in Computer Science, Engineering, or related technical subject area.
  • 7+ years of hands-on AWS experience integrating, developing, and managing applications
  • 10+ years of relevant SRE or operational work experience supporting a high-volume and/or critical production software environment
  • 10+ years of hands-on software engineering or systems engineering experience (Java and/or React services)
  • 5+ years of experience with building automation into daily operational processes through one or more programming languages (preferably Python or Go).
  • Hands-on experience using modern AI engineering technologies, including LLMs, MCP-based integrations, and agentic workflow patterns, to improve SRE operations.
  • Strong experience in configuring, tuning, and automating operational responsibilities for AWS managed data services, including RDS, DynamoDB, and ElastiCache
  • Experience with monitoring and log management tools (e.g., Datadog, CloudWatch, Grafana, Splunk)
  • Experience with container technologies and orchestration (e.g., Docker, Kubernetes, EKS)
  • Hands-on experience in triaging and tuning Java cloud applications with integration into AWS
  • Solid understanding of AWS networking systems and protocols (e.g., ALB, Route 53, API Gateway, TCP/IP, HTTP/HTTPS, DNS)
  • Experience with developing or supporting Continuous Integration and Continuous Delivery/Deployment (CI/CD) pipelines
  • Excellent leadership presence and strong verbal and written communication skills

Please refer to our Candidate Privacy Notice for more information about how we process your personal information, and your data protection rights.


At SIE, we consider several factors when setting each role’s base pay range, including the competitive benchmarking data for the market and geographic location.

Please note that the base pay range may vary in line with our hybrid working policy, and individual base pay will be determined based on job-related factors, which may include knowledge, skills, experience, and location.

In addition, this role is eligible for SIE’s top-tier benefits package that includes medical, dental, vision, matching 401(k), paid time off, wellness program and coveted employee discounts for Sony products. This role may also be eligible for a bonus package.


The estimated base pay range for this role is listed below.
$199,400 - $299,200 USD

Please note, Sony Interactive Entertainment conducts background checks at the offer stage for all new employees (which may include criminal background checks for some roles) and will need to process personal information to support these checks.

Please refer to our Candidate Privacy Notice for more information about what personal information we collect, how we use it, who we share it with, and your data protection rights.

Equal Opportunity Statement:

Sony is an Equal Opportunity Employer. All persons will receive consideration for employment without regard to gender (including gender identity, gender expression and gender reassignment), race (including colour, nationality, ethnic or national origin), religion or belief, marital or civil partnership status, disability, age, sexual orientation, pregnancy, maternity or parental status, trade union membership or membership in any other legally protected category.

We strive to create an inclusive environment, empower employees and embrace diversity. We encourage everyone to respond. 

Sony Interactive Entertainment is a Fair Chance employer and qualified applicants with arrest and conviction records will be considered for employment.


Skills Required

  • BS degree in Computer Science, Engineering, or related technical subject area
  • 7+ years hands-on AWS experience
  • 10+ years of relevant work experience
  • 10+ years of software or systems engineering experience (Java and/or C++)
  • 5+ years experience in process automation (Python or Go preferred)
  • Strong experience with AWS managed data services
  • Experience with monitoring and log management tools
  • Experience with container technologies and orchestration
  • Hands-on experience triaging Java cloud applications
  • Solid understanding of AWS networking protocols
  • Experience with CI/CD pipelines

The Company
Aliso Viejo, CA
8,768 Employees
Year Founded: 1994

What We Do

PlayStation has been at the forefront of interactive and digital entertainment since the debut of our first console in 1994. Our products delight millions across the world through incredible games, cutting edge experiences and access to many types of media. This commitment to amazing our fans is at the core of who we are and one we share with Sony Corporation, internationally known as a leader in music, movies and consumer electronics. We can only achieve this goal by welcoming talented people and empowering them to do their best work. From game developers to data scientists, software engineers to cybersecurity experts, marketing to accounting and finance professionals, we’re always looking for talented people who share a passion for creating and our commitment to delivering amazement.
