Site Reliability / Infrastructure Engineer

Reposted 9 Days Ago
New York City, NY, USA
In-Office
180K-275K Annually
Mid level
Gaming • Mobile • Software
The Role
The Site Reliability Engineer will focus on reliability, scaling infrastructure on GCP, managing incident responses, architecting database strategies, and enhancing observability across systems.
About Medal

Millions of gamers capture, share, and discover new games on Medal, the largest platform for gaming clips. Our mission is to design products that make sharing and connecting around gaming seamless and fun, and build a place where brands and game publishers can reach high-quality gamers to grow their products. Medal's data powers General Intuition, the frontier research lab that recently raised $320M at a $2.3B valuation led by Khosla Ventures with participation from General Catalyst, Eric Schmidt, and Jeff Bezos.

The Role

Medal's infrastructure handles billions of clips, video ingestion pipelines, and social features at a scale most engineers never get to touch. The work centers on reliability, incident response, scaling, and making sure our infrastructure keeps up with our growth. You'll own the on-call rotation, drive postmortems, and work directly with engineering teams to meet their infra needs. The right person probably came through startups and scale-ups, has been in the room when things broke at 2 a.m., has scaled databases under pressure, and knows the difference between a durable fix and a patch that buys you a week.

What We're Looking For
  • Infrastructure-as-code: Strong fluency in Terraform, with real experience owning infrastructure-as-code at scale

  • Elasticsearch depth: Hands-on experience running ES for user-facing features, not just as a log sink

  • GCP depth: You know GCP maybe a little too well, including Kubernetes, VPC, IAM, Cloud Logging, and the managed services ecosystem

  • Database scaling: Deep, hands-on experience scaling and sharding relational databases (MySQL, Postgres) in production

  • Incident response instincts: You can work a P0 calmly, communicate clearly under pressure, and run a postmortem that prevents recurrence

  • CI/CD: You've worked with GitHub Actions in a production environment

  • Communication (crucial!): You flag issues clearly and rapidly during incidents and lead/write actionable postmortems

  • Experience at startups: You are comfortable in an environment of rapid growth where scaling up is a priority

  • Great judgment: You know the difference between a durable, sustainable fix and a patch that buys you a week

Our Stack

Electron, React, Redux, Styled Components & other modern web-based technologies
C# and C++ for native Windows recording & more
Swift for iOS, Kotlin for Android
Java, Redis, RabbitMQ, Kubernetes for backend
Terraform, Salt, GitHub Actions, CircleCI for IaC and CI/CD

Skills Required

  • Deep hands-on experience scaling and sharding relational databases
  • Experience with GCP services including Kubernetes, IAM, and logging
  • Fluency in Terraform and operation of infrastructure-as-code
  • Strong incident response instincts in production settings
  • Experience with GitHub Actions in a CI/CD context

The Company
HQ: Naarden
57 Employees
Year Founded: 2015

What We Do

About Us: Create gaming memories while apart. Medal enables you to reliably capture and meaningfully share online memories with friends (memories that would otherwise be lost to time).

