Senior Platform Engineer: Storage

Hiring Remotely in New Zealand
Remote
Senior level
Cloud • Information Technology • Software • Infrastructure as a Service (IaaS)
The Role
Design and implement distributed high-reliability storage systems, including Ceph clusters, and develop efficient APIs for internal services.

Job description

Our core mission at Railway is to make software engineers higher leverage. We believe that people should be given powerful tools so that they can spend less time setting up to do, and more time doing.

Building the infrastructure that powers the Railway engine is the most core problem at Railway. As an infrastructure engineer working on storage, you will be directly responsible for designing the software and hardware behind performant, high-reliability block and object storage systems that back millions of applications. The solutions you build will be instrumental in not only scaling internal operations, but scaling the company to infinity and beyond!

“But the world would be a better place if more engineers, like me, hated technology. The stuff I design, if I'm successful, nobody will ever notice. Things will just work, and will be self-managing”

- Radia Perlman

Want to learn about our work culture? Here is a four-part blog series that will help you see the unique ways our team works (Parts 1, 2, 3, and 4).

About The Role

For this role, you will:

  • Design and evolve multiple production Ceph clusters, from hardware design and driving network requirements to configuring, tuning, and operating clusters and their clients

  • Create efficient, generalizable APIs using systems/kernel features to provide safe, as-fast-as-possible live migrations of stateful workloads between hosts

  • Design and build API and Orchestration services to tie storage primitives to higher level primitives using Go, gRPC, ScyllaDB and Temporal

  • Write Engineering Requirement Documents to take something from idea, to defined tasks, to implementation, to monitoring its success

  • Design and build a suite of storage primitives that can be used by customer applications and internal services, and that enable higher level platform features such as streaming image pulls or movable build caches

About You
  • Experience architecting and implementing distributed systems. You enjoy building fault tolerant, resilient, and scalable services

  • Production experience with distributed block device systems (e.g., Ceph) or a solid understanding of network storage cluster design from first principles

  • Understanding and experience with current gen filesystems (Ext4, ZFS, BTRFS). Bonus points for next gen (EROFS, bcachefs)

  • A solid intuition about how long your solutions will last. All systems age. In startups, we can hope for 2-3 orders of magnitude, or 12-18 months.

  • The tact to implement your solution, create monitors for its error boundaries, and document any requirements for when you’re not around

  • A great sense of direction and prioritization when it comes to dealing with the ambiguity of an early stage startup

  • A sense of grit to dive into a problem, implement a solution, scale that solution, and replace it when needed

  • A great set of communication skills for getting your point across, getting your solution implemented, and beyond

We value and love to work with diverse persons from all backgrounds

Things to know

For better or worse, we're a startup; our team dynamics are different from companies of different sizes and stages.

  • We're distributed ALL across the globe, and we're only going to become more and more distributed. As a result, stuff is ALWAYS happening.

  • We do NOT expect you to work all the time, but you'll have to be diligent about your boundaries because the end of your day may overlap with the start of someone else's.

  • We're a small team, with high ownership, who are not only passionate about what we do, but seek to be exceptional as well. At the time of writing we're 21, serving hundreds of thousands of users. There's a lot of stuff going on, and a lot of ambiguity.

  • We want you to own it. We believe that ownership is a key to growth, and part of that growth is not only being able to make the choices, but owning the success, or failure, that comes with those choices.

Benefits and perks

At Railway, we provide best in class benefits. Great salary, full health benefits including dependents, strong equity grants, equipment stipend, and much more. For more details, check back on the main careers page.

Beyond compensation, there are a few things that we believe that make working at Railway truly unique:

  • Autonomy: We have very few meetings. Just a Monday and a Friday to go over the Company Board. We think your time is sacred, whether it's at work, or outside of work.

  • Ownership: We're a company with a high ownership, high autonomy culture. We hope that you'll come in, help us, and over the course of many years do the best work of your life. When we bring you onboard, we expect you to change the company.

  • Novel problems/solutions: We're a startup that's well funded, with cool problems, which lets us implement novel solutions! We abhor “busywork” and think that, whether it's community, engineering, operations, etc., there's always opportunity for creative and high leverage solutions.

  • Growth: We want you to grow with us, but we know that talent is loaned, so when you figure out what area you want to grow in next, whether it's at Railway or outside, we'll make sure you land there.

How we hire

No tricks. No surprises. Here's the entire process:

  1. Talk with us about the role

    • This is completely open ended and we're just trying to see who you are, what you want to do, and where you wanna go.

  2. Work on a small project to discuss in the interview

    • Asynchronously implement the following:

    • Pre-interview: Design a Storage Engine to power something like Railway's Volume

    • You can, and SHOULD! ask us questions ahead of time.

  3. Review your solution with the Team

    1. You'll sit down with someone on the team and go over the above. We'll poke into your solution, as well as get you acquainted with two more members of the team.

      1. Looking for: Learn about your problem solving skills. How you break down a problem and how you present a solution.

    2. Interview Structure (60 Minutes):

      1. Prework (submitted before your interview): Complete your solution

      2. 0-5m: introduction

      3. 5-50m: Building (or expanding) your solution

      4. 50-60m: Questions on Railway/Tech/etc

  4. Meet the Team

    1. You'll meet the Team, which will be comprised of 4 people from vastly different sections of the company.

      1. Looking for: How you work with the rest of the team and communicate.

  5. Offer and Details Chat with CEO

    1. Finally, we will go over the process, the role, and hammer out the details about your position, onboarding, and all the deets.

#Global

Skills Required

  • Experience architecting and implementing distributed systems
  • Production experience with distributed block device systems (e.g., Ceph)
  • Understanding and experience with current gen filesystems (Ext4, ZFS, BTRFS)
  • Ability to design fault tolerant and resilient storage services
  • Solid communication skills for implementing solutions

The Company
50 Employees
Year Founded: 2020

What We Do

Railway aims to make developers orders of magnitude more efficient by providing a platform that simplifies deploying logic without servers and building infrastructure automation tools. They also focus on making it simple to ship anything by building global datacenters.
