Rezdy

Staff Platform Engineer

Posted 23 Days Ago

Country States, Pájaros Barrio, Bayamón, PRI

In-Office

Senior level

Software • Travel

The Role

Own and evolve infrastructure for a new product: AWS environments, IaC (Pulumi/TypeScript or Terraform), CI/CD, container platform, PostgreSQL operations, observability, incident readiness, release safety, and developer experience. Mentor engineers and design scalable, reliable, secure platform patterns and operational practices.

Summary Generated by Built In

About the Role

We’re hiring a Staff Platform Engineer to join Manifest, a new product being built in a high-autonomy, fast-moving environment.

This is a hands-on, staff-level role for someone who can own critical infrastructure, improve the developer experience, and partner closely with product engineers, DevOps leadership, and technical leads. We’re looking for someone who can operate production systems, but also design the guardrails, patterns, and platform capabilities that allow the team to move faster and more safely over time.

This role is a strong fit for someone who enjoys working close to the product team, understands the realities of building in a startup-like environment, and can bring structure, reliability, and technical depth to a fast-moving team.

What You’ll Do

Work on a team with two other platform engineers.
Own and evolve the infrastructure that supports Manifest, including AWS environments, networking, compute, data services, observability, CI/CD, and operational tooling.
Work with Pulumi and TypeScript to define, maintain, and improve infrastructure as code across the platform.
Support and improve our containerized application platform, including deployment pipelines, rollback mechanisms, and runtime configuration.
Help operate and harden our data infrastructure, including connection pooling, backups, disaster recovery, replication, and safe schema-change practices.
Partner with engineers to improve the reliability and safety of releases, including database migrations, deployment workflows, environment management, and production readiness checks.
Improve CI/CD workflows so that builds, tests, infrastructure changes, and deployments are fast, reliable, and easy for engineers to understand.
Lead observability and incident readiness work, including alerting, dashboards, SLOs, runbooks, incident response practices, and post-incident follow-up.
Help ensure the platform is secure, cost-conscious, and maintainable as the product scales.
Mentor engineers on infrastructure, operations, reliability, and production ownership.

What We’re Looking For
We’re looking for someone who has operated meaningful production systems and can bring staff-level judgment to infrastructure, reliability, and developer experience. Strong candidates will have:

Deep production experience with AWS, especially services such as ECS/Fargate, RDS/Aurora PostgreSQL, VPC networking, load balancing, IAM, KMS, Secrets Manager, CloudFront, WAF, and related managed services.
Experience designing and operating systems that serve a global user base, with seamless multi-region availability and disaster recovery procedures.
A habit of treating reliability, scalability, performance, and observability as first-class design constraints, built into designs from the start rather than bolted on later.
Strong infrastructure-as-code experience. Pulumi with TypeScript is ideal, but deep experience with Terraform or another mature IaC approach is also valuable.
Strong operational knowledge of PostgreSQL, including performance investigation, connection pooling, backups, replication, locking, migrations, and safe schema-change patterns.
Experience designing and maintaining CI/CD systems, ideally with GitHub Actions, OIDC-based cloud authentication, container builds, environment promotion, required checks, and deployment gates.
Experience supporting containerized production workloads and improving deployment safety, rollback strategies, and runtime reliability.
Strong observability and incident response experience, including metrics, logs, traces, alerting, dashboards, runbooks, and post-incident learning.
The ability to work effectively in ambiguity, make pragmatic tradeoffs, and communicate clearly with both infrastructure specialists and product engineers.
A track record of raising the engineering bar through reusable patterns, documentation, automation, mentoring, and thoughtful technical leadership.

Our Environment
Manifest operates with a lean process and a high degree of ownership. Engineers are expected to work effectively in ambiguity, clarify requirements, collaborate directly across functions, and ship pragmatic, high-quality solutions.

The DevOps function is critical to that operating model. Resilient, well-planned infrastructure is essential, and we do not want speed to come at the expense of reliability, security, or maintainability. This role exists to help Manifest find that balance as the product moves toward launch and scale.

You’ll work closely with product engineers, technical leads, DevOps leadership, and other stakeholders to ensure the platform is ready for real customers, real traffic, and real operational demands.

Why Join

This is an opportunity to help shape the foundation for a new product at an important stage.

You’ll be joining early enough to have real influence over how Manifest operates, deploys, scales, and responds to incidents. You’ll work on meaningful infrastructure problems, partner with a highly autonomous engineering team, and help define the standards that will carry the product into production and beyond.

If you’re excited by the combination of hands-on infrastructure work, production reliability, developer experience, and staff-level technical leadership, we’d love to talk.

Skills Required

Experience operating meaningful production systems, with a demonstrated record of production reliability
Deep AWS experience including ECS/Fargate, RDS/Aurora PostgreSQL, VPC, load balancing, IAM, KMS, Secrets Manager, CloudFront, WAF
Designing and operating multi-region availability, disaster recovery, and global-serving systems
Infrastructure-as-code experience (Pulumi with TypeScript ideal; Terraform or another mature IaC also acceptable)
Strong operational PostgreSQL knowledge: performance investigation, connection pooling, backups, replication, migrations, safe schema-change patterns
Designing and maintaining CI/CD systems (ideally GitHub Actions, OIDC-based auth, container builds, environment promotion, deployment gates)
Experience supporting containerized production workloads, deployment safety, rollback strategies, and runtime reliability
Observability and incident response experience (metrics, logs, traces, alerting, dashboards, runbooks, post-incident follow-up)
Treating reliability, scalability, performance, and observability as first-class design constraints
Ability to work effectively in ambiguity, make pragmatic tradeoffs, and communicate with infrastructure and product engineers
Mentoring, documentation, automation, and raising the engineering bar through reusable patterns and technical leadership

The Company

Surry Hills, New South Wales

112 Employees

Year Founded: 2011

What We Do

Rezdy is the world’s leading independent online booking and distribution platform, powering the experiences industry. Rezdy was launched in 2011 after founder Simon, working in a dive centre in Thailand, found himself spending more time behind a desk handling customer bookings and admin than out in the water sharing his love of diving with clients. With a background in IT, Simon set out to solve his own problem, and in doing so discovered a gap in the market. His vision was to help people like him get more bookings and grow their business with less effort, empowering them to get back to doing what they love.

Rezdy is proud to work with thousands of tour and activity operators and agents of all sizes in over 130 countries. Today, Rezdy has headquarters in Sydney, Australia, and Raleigh, U.S.A., and processes over $1.3 billion in tour and activity bookings through its platform every year.

Rezdy's mission: "To power the growth of the experiences industry with tools and connections to make life easier."

Rezdy's values:

"Nurture your adventurous spirit"
We are all leaders, in search of a better way
We should be brave and curious; ready for anything

"Own it, Make it Happen"
We are all agents of positive change
We get stuff done!

"Achieve More, Together"
We are one team, united in purpose and journey
We support, collaborate and learn from each other to drive better results