Site Reliability Engineer

26 Locations
Remote
Mid level
Fintech • Information Technology
The Role
Operate and improve Alpaca's production infrastructure: handle on-call incident response, define SLIs/SLOs, enhance observability, ship infrastructure as code via GitOps, and strengthen PostgreSQL reliability (performance, migrations, HA/DR). Mentor teams on reliability and database fundamentals.

Who We Are:

Alpaca is a US-headquartered self-clearing broker-dealer and brokerage infrastructure for stocks, ETFs, options, crypto, fixed income, 24/5 trading, and more. Our recent Series D funding round brought our total investment to over $320 million, fueling our ambitious vision.

Through our subsidiaries, Alpaca is a licensed financial services company, serving hundreds of financial institutions across 40 countries with our institutional-grade APIs. These include broker-dealers, investment advisors, wealth managers, hedge funds, and crypto exchanges, which together total over 9 million brokerage accounts.

Our global team is a diverse group of experienced engineers, traders, and brokerage professionals who are working to achieve our mission of opening financial services to everyone on the planet. We're deeply committed to open-source contributions and fostering a vibrant community, continuously enhancing our award-winning, developer-friendly API and the robust infrastructure behind it.

Alpaca is proudly backed by top-tier global investors, including Portage Ventures, Spark Capital, Tribe Capital, Social Leverage, Horizons Ventures, Unbound, SBI Group, Derayah Financial, Elefund, and Y Combinator.


Our Team Members:

We're a dynamic team of 380+ globally distributed members who thrive working from our favorite places around the world, with teammates spanning the USA, Canada, Japan, Hungary, Nigeria, Brazil, the UK, and beyond!
We're searching for passionate individuals eager to contribute to Alpaca's rapid growth. If you align with our core values—Stay Curious, Have Empathy, and Be Accountable—and are ready to make a significant impact, we encourage you to apply.

Your Role:

As a Site Reliability Engineer at Alpaca, you'll help keep our brokerage platform reliable, observable, and operable as we grow, working across our cloud infrastructure, Kubernetes platform, observability stack, messaging layer, and data layer. We're especially interested in candidates with strong PostgreSQL fundamentals who'd like to grow into deeper ownership of our database reliability posture. PostgreSQL sits on the trading-critical path, and we want this person to spend a meaningful share of their time leveling it up while remaining a well-rounded SRE the rest of the week.

Things You Get To Do
  • Operate production day-to-day - on-call, incident response, postmortems, and the follow-ups that actually close the loop.
  • Own reliability practice - define and refine SLIs/SLOs and error budgets, and help product teams live within them.
  • Strengthen our observability across metrics, logs, traces, and alerting.
  • Ship infrastructure through code in a GitOps workflow - cloud resources and Kubernetes workloads alike.
  • Look after PostgreSQL: performance tuning, schema and migration review, online migrations on large tables, HA/DR, and CDC pipelines.
  • Mentor engineers on reliability and database fundamentals through code review, design review, and pairing.
Who You Are (Must-Haves):
  • 4+ years in SRE, DevOps, Platform/Infrastructure, or backend engineering with significant production operations ownership.
  • Hands-on experience operating production services on Kubernetes, and shipping infrastructure as code in a GitOps workflow.
  • Solid working knowledge of PostgreSQL in production - query plans, pg_stat_*, indexing and schema trade-offs, and what a safe online migration looks like on a non-trivial table.
  • Cloud networking fundamentals (VPCs, routing, L4/L7 load balancing, DNS, TLS) and comfort debugging cross-service connectivity.
  • Comfortable with a modern observability stack and proficient with Linux at the operator level.
  • Practiced in incident response - calm under pressure, structured debugging, postmortems that drive change.
  • At least working proficiency in Go or Python, plus strong written and verbal communication.
  • Genuine interest in databases and in growing your PostgreSQL/DBA expertise.
Who You Might Be (Nice-to-Haves):
  • Deeper PostgreSQL experience: large clusters at OLTP load, online migrations on big tables, HA/DR ownership, connection pooling at scale, or change-data-capture pipelines.
  • Experience with typed SQL access layers in Go (e.g. pgx, gorm, sqlc).
  • Production experience with messaging systems at scale (e.g. RabbitMQ, Kafka, Redpanda).
  • Security & compliance experience in a regulated environment (SOC 2, secrets management, audit logging).
  • Familiarity with trading, brokerage, or other regulated fintech domains.
How We Take Care of You:
  • Competitive Salary & Stock Options
  • Health Benefits
  • New Hire Home-Office Setup: One-time USD $500
  • Monthly Stipend: USD $150 per month via a Brex Card

Alpaca is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.

The Company
San Mateo, CA
132 Employees
Year Founded: 2015

What We Do

Alpaca's mission is to unlock asset management for the people. We are a technology company that modularizes the world’s asset management activities. Alpaca’s products enable anyone to build and connect applications and algorithms to buy and sell stocks with zero commissions. We believe that everyone should have fair access to financial markets, regardless of who we are or where we are from. *Securities are offered through Alpaca Securities LLC (alpaca.markets)*
