Zipline’s Platform 1 system powers our long-range autonomous aircraft and delivery infrastructure, an integrated stack of on-prem hardware, robotics, and cloud-connected services that must perform flawlessly, around the clock, in the real world. As a DevOps Engineer, you’ll be part of the team that ensures these systems remain reliable, observable, and scalable as we expand globally. You’ll work across the boundary between software and hardware, building monitoring frameworks, automating deployments, and managing the infrastructure that keeps Zipline’s physical operations connected and performing.
You are someone who thrives in complex environments, loves solving systems challenges, and takes pride in building reliability into everything you touch. You bring technical depth, hands-on expertise, and a mindset that blends engineering precision with operational pragmatism.
What You'll Do
- Ensure reliability and uptime of Platform 1’s hybrid infrastructure, spanning on-prem servers, edge devices, and infrastructure for cloud-based services.
- Support application engineers deploying software by owning the deploy toolchain and managing the infrastructure their services run on.
- Design, implement, and evolve observability systems (metrics, logging, tracing, and alerting) to provide deep visibility into system health and performance.
- Automate and scale maintenance operations for our on-prem servers, reducing manual intervention and improving deployment repeatability using tools like Terraform and Ansible.
- Administer and optimize Linux systems and network configurations that support mission-critical operations.
- Lead and participate in incident response, driving both quick resolution and long-term prevention through post-incident analysis and automation.
- Partner with software, flight systems, and operations teams to diagnose, resolve, and prevent system-level issues across environments.
- Become THE in-house expert for DevOps on Platform 1 – learn, understand, and work to improve our compute infrastructure and development practices.
- Continuously improve standards and processes for system configuration, deployment, and monitoring, helping raise the technical bar for reliability at Zipline.
What You'll Bring
- 6+ years of professional experience in DevOps, Site Reliability, and/or Infrastructure Engineering roles.
- Deep expertise in Linux systems administration, performance tuning, and troubleshooting.
- Experience managing and scaling on-prem and hybrid infrastructure environments.
- Proficiency in monitoring and logging tools (Prometheus, Grafana, ELK, etc.) and a strong understanding of observability principles.
- Familiarity with infrastructure-as-code tools (e.g., Terraform, CDK).
- Scripting or programming skills in Python and Bash.
- Strong communication and cross-functional collaboration skills—you work well across hardware, software, and operations domains.
- A problem-solving mindset, with the grit and adaptability to thrive in dynamic, evolving systems.
- Experience with container orchestration (Kubernetes, Docker/Docker Compose); huge plus if this experience is in hybrid or on-prem deployments.
- Background in networking, bare-metal server management, or robotics infrastructure is a plus.
- Familiarity with CI/CD and deployment pipelines for hardware-software systems is a plus.
What We Do
Zipline is the world's largest autonomous delivery network and is powered entirely by fixed-wing drones. Our fleet flies the equivalent of the distance around the equator every 2.5 days, and we have shipped hundreds of thousands of critical medical products across Rwanda and Ghana, with operations now beginning in the United States.
Why Work With Us
Zipline is the perfect intersection of super cutting-edge tech, a deep social mission, and an extremely compelling business case. Our small, scrappy, customer-obsessed, humble, and mission-driven team has set the bar for what is possible in the drone logistics industry globally, and has designed some incredibly elegant technology in the process.