Infrastructure Engineer

Gauntlet

Reposted 17 Hours Ago

Hiring Remotely in New York, NY, USA

In-Office or Remote

Senior level

Blockchain • Fintech • Software • Cryptocurrency

The Role

Build and maintain cloud-native infrastructure on GCP: own CI/CD, Terraform modules, Kubernetes/Helm deployments, observability consolidation, resilience and security improvements, and automate routine ops with AI agents while supporting application teams and SOC 2 readiness.

Summary Generated by Built In

You will help build the infrastructure behind one of the largest asset managers in onchain finance. Gauntlet serves $1.5B+ in client TVL, and the platform that ships and runs it is small, modern, and yours to shape. This is the team's second infrastructure hire: you'll work directly with our infra/platform lead, own real surface area from week one, and have a genuine say in the tools, patterns, and direction we take. If you want hands-on ownership of a cloud-native platform rather than a narrow slice of someone else's, read on.

About Gauntlet

Gauntlet builds the financial systems of the future. While much of onchain finance is focused on point solutions, we operate across the entire stack to offer best-in-class vault products. Today we serve over $1.5B in client TVL across some of the largest fintechs/neobanks, protocols, exchanges, and capital allocators in crypto — and, increasingly, traditional asset management. Our team brings together traditional finance and crypto-native expertise to deliver durable, sophisticated products for institutional clients moving onchain.

The role

Infrastructure & Security keeps Gauntlet's services shipping safely and reliably. Today it's effectively one engineer carrying both large, org-spanning initiatives (platform build-out, SOC 2, deployment security) and the steady stream of day-to-day requests from product teams. You'll take real ownership of that workload across our GCP, Kubernetes, and Terraform stack: unblocking application teams, hardening CI/CD, and driving infrastructure projects end-to-end so the platform can scale with the company. You'll partner closely with the application teams (Aera, Vault Curation) and with Security.

What you'll do

Support the application teams: turn around infra requests (permissions, roles, service setup, project peering) so product engineers stay focused on shipping.

Own CI/CD and deployments: maintain and extend our GitHub Actions workflows and help migrate toward a dedicated CD tool with proper permissioning — the goal is fully automated, locked-down deploys via service accounts, no direct engineer access to production.

Build and maintain infrastructure as code: author and update Terraform modules for new and existing services across GCP environments.

Run Kubernetes the right way: manage service deployments via Helm (we're on Helm 4) and keep async workloads healthy on Dagster.

Unify observability (likely first project): consolidate today's per-team alerting into a single view — system-to-system dashboards plus incident alerting that routes upstream service/vendor failures to the right impacted teams and on-call rotations.

Advance resilience: help move us toward a fully region- and cloud-agnostic posture so services can pick up and move if something fails.

Strengthen security & access: apply IAM, secrets management, least privilege, and auditability; contribute to SOC 2 readiness.

Automate with AI: build agent skills / agents.md so routine tasks (provisioning access, simple changes) can be handled by an agent instead of human engineering hours, and use AI to reason through bigger problems.

What success looks like

First 30 days. Ramp on the stack (GCP, Kubernetes/Helm, Terraform, GitHub Actions, Dagster). Meet the application and security stakeholders, and start reliably handling application-team requests.

First 90 days. Operating independently on the reactive workload and proactively creating/updating/managing infrastructure across GCP environments. On-call onboarding complete (Roby shadows then reverse-shadows your first shifts).

In 1 year. Delivered concrete platform improvements — new Terraform modules meeting app-team needs, upstream dependency upgrades, and a unified alerting/observability framework wired into incident reporting and on-call. Trusted to take significant infra projects off the lead's plate.

What you bring

Strong software-engineering fundamentals in at least one production language (Python, Go, TypeScript, or Rust); Python especially valued, plus comfort scripting and working in the shell.

Hands-on experience with cloud infrastructure and core cloud services, especially GCP (AWS/Azure transferable).

Experience operating large-scale Kubernetes production systems.

Experience with Infrastructure as Code, especially Terraform.

Familiarity with CI/CD systems, especially GitHub Actions or Octopus Deploy.

Ability to debug production issues using logs, metrics, traces, shell tools, and source code.

Security and access-control fundamentals: IAM, secrets management, least privilege, and auditability.

Clear written communication around incidents, design decisions, and operational procedures.

Bonus points

Supporting SOC 2 controls: evidence collection, access reviews, change management, or audit readiness.

Observability with Datadog, Prometheus, Grafana, OpenTelemetry, Honeycomb, or similar.

Improving developer experience through internal tooling, templates, scripts, or platform APIs.

Incident response experience, including postmortems and follow-up remediation.

Experience with Dagster, Helm 3+, high-scale CD tooling (Bazel, Octopus), or AI/agent-assisted ops.

Basic web3 / DeFi literacy (transactions, wallets) and genuine curiosity about onchain — the role doesn't touch chain directly, but the business is onchain.

The Company

HQ: New York, NY

29 Employees

Year Founded: 2018

What We Do

Gauntlet’s mission is to drive understanding and participation in the financial systems of the future. Building decentralized systems creates new challenges for protocol developers, smart contract developers, and asset holders that are not seen in traditional development and investing. Gauntlet has created a blockchain simulation and testing platform that leverages battle-tested techniques from other industries to build financial models of crypto products. Simulation provides transparency and greatly reduces the cost of experimentation so that blockchain protocols and smart contracts are safe for users.