Site Reliability Engineer

Indonesia
Mid level
Cloud • Gaming • Software
The Role
The Site Reliability Engineer maintains service reliability, designs infrastructure, implements automation, monitors systems, and collaborates with stakeholders.
Summary Generated by Built In

At AccelByte, our mission is to empower game creators by providing the backend platform and tools they need to make scalable, reliable, AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world, including Fortnite, Epic Store, Xbox Live, PlayStation Network, and EA Origin. We are backed by top investors including SoftBank, Sony Interactive Entertainment, Galaxy Interactive, NetEase, and Krafton. Our latest Series B funding has solidified our place as a top player in the gaming industry. AccelByte's team brings decades of experience building and shipping platforms at this scale.

We believe that the best companies empower employees to make decisions, obsess about the best user experience, and are not afraid to make mistakes and learn from them. Our culture is based on humility, openness to feedback, drive, and collaboration, which we feel results in the best-performing teams. As a company that values diversity, inclusion, and employee growth, we give our employees opportunities to work with and learn from teams all over the world. We offer competitive salaries, a full range of health benefits, social activities, career growth opportunities, and an amazing team. Come join us!

POSITION SUMMARY:

AccelByte is building a 24x7 operations team for AAA multiplayer video games. We need a driven Site Reliability Engineer who will take an active part in day-to-day operations: maintaining high service reliability and driving the prioritization of fixes for what is broken today. The role also calls for envisioning, designing, and implementing processes and technologies that improve our ability to identify, isolate, correlate, and mitigate service-impacting problems. The Site Reliability Engineer must also be able to code well enough to automate routine tasks such as gathering, correlating, organizing, and presenting service metrics, and must be able to perform detailed, in-depth root cause analysis.


ESSENTIAL FUNCTIONS/RESPONSIBILITIES:

The Site Reliability Engineer (SRE) is accountable for the following functions and responsibilities:


  • Design, implement, and maintain infrastructure for applications
  • Build and run service deployment using K8s and other CNCF projects
  • Provide a secure, highly scalable, and cost-effective cloud platform
  • Build effective systems to monitor the health of our systems and applications and to handle outages
  • Solve problems occurring in all our environments and create solutions to prevent them from happening again
  • Produce automation and innovative tools to assist the product development teams and to deliver operational excellence
  • Create and maintain infrastructure-related documentation and SRE runbooks
  • Collaborate with other stakeholders to provide cost-effective, operationally excellent, and performance-efficient infrastructure solutions that improve our products
  • Identify technology, process gaps, and opportunities for improvement
  • Liaise, communicate, and work directly with our client

QUALIFICATIONS/EXPERIENCE REQUIRED

  • 2+ years Linux administration
  • Degree in Computer Science or equivalent experience
  • Prior experience helping design, manage, and run large-scale applications in the cloud
  • Experience with monitoring systems and strategies (System Admin)
  • Solid performance and troubleshooting skills
  • Solid foundation in distributed systems
  • Robust knowledge of and experience with at least one cloud provider (AWS or GCP preferred)
  • Experience with containerization principles and frameworks such as Docker, Container, Kubernetes, etc.
  • Proven track record of building infrastructure as code (Terraform is a must), and of working with configuration management and package managers (e.g., Helm charts)
  • Proven experience with automation, CI/CD, and GitOps tools such as Jenkins, GitLab, GitHub, Flux, and/or ArgoCD
  • Experience with monitoring and alerting tools such as Prometheus, Grafana, ELK/EFK, Splunk, Datadog, OpsGenie, PagerDuty, etc.
  • Experience within a greenfield environment, building infrastructure from scratch
  • Software development and scripting experience with Bash, Python, and/or Golang
  • Ability to work with clients on tight deadlines and fluid requirements
  • Good communication skills (escalating issues, explaining incidents)
  • Fluent in spoken and written English
  • Willing to work in shifts (24/7 coverage)

QUALIFICATIONS/EXPERIENCE PREFERRED

  • Contributions to open source projects and participation in technical communities
  • Experience working for or with AAA game studios
  • JVM tuning and troubleshooting
  • Experience with web services
  • Experience in Networking, Security, or Storage
  • Experience managing SQL and NoSQL databases
  • Familiarity with Perforce version control

AccelByte Inc is an Equal Employment Opportunity Employer. All qualified candidates and applicants will receive consideration for employment without regard to race, religion, gender, national origin, sexual orientation, marital status, age, or disability. Our culture is innovative and inclusive, and we value our people above all.

Please visit our careers page for a complete listing of our open positions: https://accelbyte.io/careers

Top Skills

ArgoCD
AWS
Bash
Datadog
Docker
ELK
Flux
GCP
Git
GitLab
Go
Grafana
Jenkins
Kubernetes
Linux
Prometheus
Python
Splunk
Terraform

The Company
HQ: Seattle, WA
353 Employees
Year Founded: 2016

What We Do

AccelByte provides a comprehensive white-label backend for game studios to develop, publish, and operate games-as-a-service. Drawing on years of combined experience running large-scale game services, we built our technology from scratch to bring AAA-pedigree game services to everyone. We help game studios enable cross-platform play and progression and allow seamless crossover between titles. We are a one-stop shop for online game services.
