Infra/DevOps Engineer

Reposted 11 Days Ago
28 Locations
Remote
Senior level
Artificial Intelligence • Security • Cybersecurity
AI meets Vulnerability Management.
The Role
As an Infra/DevOps Engineer, you'll design, implement, and maintain our infrastructure, ensuring scalability, security, and reliability while collaborating with engineers and optimizing performance.

Summary of the Role:

As an Infra/DevOps Engineer at Maze, you'll be the architect of our complex, multi-account Kubernetes infrastructure, building and scaling the foundation that powers our AI-driven cybersecurity platform across isolated enterprise environments. This is a unique opportunity to join as one of the early members of the engineering team at a well-funded startup building at the intersection of generative AI and cybersecurity. You'll design, code, and maintain sophisticated infrastructure spanning 12-15 AWS accounts, each with dedicated Kubernetes clusters, ensuring complete data segregation for our security-conscious enterprise customers.

You'll take full ownership of our infrastructure-as-code implementation, managing multiple Kubernetes clusters at scale using cutting-edge tools like Karpenter, Flux, and Kustomize. Your success will be measured by infrastructure reliability, deployment velocity, and your ability to build self-managed, distributed systems that scale elegantly as we grow from startup to enterprise scale. This role is perfect for a hands-on infrastructure engineer who has mastered complex Kubernetes deployments at scale, writes production-grade infrastructure code, and thrives on building simple, elegant solutions to complex distributed systems challenges.

Your Contributions to Our Journey:

  • Architect Multi-Cluster Kubernetes Infrastructure: Design, implement, and write infrastructure-as-code for our complex Kubernetes setup spanning multiple AWS accounts, ensuring each cluster is completely isolated for enterprise security requirements while maintaining operational efficiency

  • Build Self-Managed, Distributed Systems: Develop infrastructure that manages itself through GitOps workflows using Flux and Kustomize, creating distributed systems where actions in one place automatically trigger appropriate changes across the infrastructure without manual intervention

  • Scale Kubernetes Operations: Manage and optimize dozens of Kubernetes clusters across our multi-tenant and single-tenant environments, implementing auto-scaling solutions with Karpenter and ensuring seamless scaling as customer workloads grow exponentially

  • Develop Production-Grade Automation: Write robust, maintainable code to build and maintain CI/CD pipelines, custom automation tools, and deployment scripts that enable rapid feature delivery while maintaining the highest reliability standards

  • Ensure Enterprise Security: Implement security best practices and compliance measures that protect our highly sensitive security data, managing firewalls, encryption, IAM policies, and network segregation across our multi-account AWS architecture

  • Optimize Platform Performance: Build comprehensive monitoring, logging, and alerting systems that proactively identify issues, using tools like Prometheus and Grafana to ensure our infrastructure scales efficiently as we handle increasingly complex workloads

  • Enable Engineering Velocity: Work closely with backend and data engineering teams to build self-service infrastructure capabilities, allowing teams to provision databases, deploy services, and scale resources independently without constant infrastructure team involvement

What You Need to Be Successful:

  • Kubernetes Mastery at Scale: 5+ years of infrastructure/DevOps experience with deep, hands-on expertise managing complex Kubernetes deployments—you must have experience with multiple Kubernetes clusters (tens of clusters) in sophisticated setups, not just simple single-cluster environments

  • GitOps and Modern K8s Tooling: Proven production experience with Karpenter (for auto-scaling), Flux (for GitOps), and Kustomize (for configuration management). If you have these three, you'll feel right at home with our infrastructure approach

  • AWS Infrastructure Expertise: Deep knowledge of AWS with hands-on experience managing complex multi-account architectures, understanding how to design for isolation, security, and scalability across numerous AWS accounts with proper networking and IAM configuration

  • Infrastructure-as-Code Excellence: Strong coding skills with production experience using Terraform or CloudFormation, writing maintainable, well-architected infrastructure code that follows best practices and scales with organizational growth

  • Simplicity-Driven Architecture: Proven ability to build simple, elegant solutions to complex infrastructure problems—you instinctively know the "right way" to use tools like Helm charts and avoid over-engineering while maintaining scalability

  • Platform Thinking: Experience building infrastructure with a platform mindset, creating systems that support multiple products and enable team self-service rather than building one-off solutions for individual applications

  • AWS Managed Services Philosophy: Understanding of when to use AWS managed services (RDS, MSK, EMR) versus building custom solutions, with experience scaling startups using managed services efficiently before investing in complex self-hosted infrastructure

  • Distributed Systems Mindset: Deep understanding of distributed systems principles with experience building infrastructure that is decentralized rather than centralized, allowing independent operation across multiple clusters and regions

  • Nice to haves:

    • Experience with AWS auto-scaling across complex, multi-cluster environments

    • Background in security-focused infrastructure or handling sensitive enterprise data

    • Previous experience scaling infrastructure at scale-ups that grew from 20 to 100+ engineers

    • Knowledge of infrastructure observability tools beyond Prometheus/Grafana (e.g., ELK Stack)

    • Track record of building infrastructure that went through SOC2, ISO, or similar compliance certifications

Why Join Us:

  • Ambitious Infrastructure Challenges: We're using generative AI (LLMs and agents) to solve critical cybersecurity challenges, requiring sophisticated infrastructure that handles sensitive security data across isolated enterprise environments. You'll build the foundation for breakthrough AI-powered security solutions at unprecedented scale.

  • Expert Team: We are a team of hands-on leaders with deep experience in Big Tech and scale-ups. Our team members have been part of the leadership teams behind multiple acquisitions and an IPO.

  • Impactful Work: Cybersecurity is a force for good—helping stop cyber attacks ultimately helps deliver better outcomes for all of us. The infrastructure you build will directly enable security teams to protect organizations worldwide from real threats.

  • Build an AI-Native Company: We're building a new company in the AI era with the opportunity to design everything from the ground up—you'll architect infrastructure using cutting-edge Kubernetes practices and establish platform standards that will scale with us from startup through hypergrowth.

  • Technical Leadership Growth: Direct partnership with experienced engineering leadership, significant equity upside, and the opportunity to own and shape the entire infrastructure function as we scale our platform to support the world's largest enterprises.

Top Skills

Ansible
AWS
Azure
Chef
CloudFormation
ELK Stack
GCP
Grafana
Prometheus
Puppet
Terraform

The Company
HQ: London
36 Employees
Year Founded: 2024

What We Do

AI Agents that investigate and resolve security vulnerabilities.
