Teraswitch Inc.

Senior Infrastructure Engineer (KVM Compute / Distributed Storage)

Pittsburgh, PA, USA

In-Office

Senior level

Information Technology

Teraswitch provides public cloud, single-tenant servers, storage, and low-latency connectivity across 15 global locations.

The Role

The Senior Infrastructure Engineer will design and implement KVM-based cloud and distributed storage services, focusing on automation, scalability, and security while collaborating with engineering teams.

Engineered to outperform, Teraswitch is on a mission to provide high-performance infrastructure services for critical workloads. With 20+ datacenter locations around the world interconnected by our low-latency global backbone network, we are the class leader in performance bare metal hosting and are rapidly expanding into additional infrastructure services.

The Job

The Infrastructure Engineering team at Teraswitch is responsible for the compute, storage, and platform infrastructure that powers our products and internal operations.

This senior/staff-level role is focused on building provider-grade hosted compute and storage services: specifically, a KVM-based VM product and a distributed storage product offering object (S3) and block (NVMe/TCP) services. Qualified candidates will have depth in at least one of these areas. You will help architect and build cloud-scale, globally distributed products for a high-performance infrastructure provider, with an emphasis on automation, scalability, and security by design.

While this role has a compute and storage services focus, as a senior member of the Infrastructure Engineering team, you’ll also be expected to cross-train and contribute broadly across infrastructure domains as we grow the team.

What You’ll Do

Design and implement provider-scale, globally distributed hosted services - with a focus on compute (KVM-based cloud), storage (distributed object and block services), or both
- Compute track: Evaluate/design, implement, and manage a KVM-based cloud compute platform
- Storage track: Evaluate, implement, and manage a distributed storage platform (Ceph, Weka, VAST, etc.) that supports object (S3) and block (NVMe/TCP) protocols
Define provisioning workflows, node/fleet management, and scalable operations
Integrate service networking primitives (IPAM, DHCP, DNS) and customer interfaces to the product
Design multi-tenant provisioning and controls: isolation boundaries, quotas/limits, metering, and security
Build automation and tooling for global deployments of these products: upgrades, capacity expansion, failure handling, rebalancing
Implement robust observability for these products to enhance production service reliability (metrics, logs, traces; dashboards; actionable alerting)
Collaborate with the Software team to integrate these products with our customer control plane (portal, API) and billing systems, ensuring robust customer-driven lifecycle management
Cross-train with the rest of the Infrastructure Engineering team and contribute broadly to the compute, storage, and platform infrastructure that powers Teraswitch products and internal operations

Basic Qualifications

Strong Linux systems and networking expertise, production operations experience
Depth in at least one of the following:
- Compute / virtualization: KVM/QEMU, libvirt and/or platforms such as Proxmox/OpenStack; image pipelines; fleet operations; multi-tenant considerations
- Distributed storage services: experience with distributed storage platforms (Ceph, VAST, Weka, or similar) and/or managing block/object storage offerings; public/multi-tenant deployment experience is a plus
Automation - experience with scripting (Python, bash, etc.) and/or configuration management (Ansible or similar)
Experience with observability/monitoring systems (metrics, logs, traces, alerting) and using them to enhance production service reliability
Comfortable working in a fast-paced, results-oriented environment
Committed to operational best practices and security by design

Preferred Skills/Experience

You do not need all of these—depth in a few areas plus strong fundamentals is sufficient:

Service / hosting provider experience (multi-tenant systems, automation-first operations, scalable and secure design)
Experience with VPS/KVM hosting at scale, including networking and security
Experience with distributed storage systems such as Ceph, Weka, or VAST, particularly in a service provider environment
Expertise in object storage / S3 services - gateway/front-door patterns (F5/Nginx/HAProxy), networking, durability, security
Strong networking fundamentals relevant to provider environments (routing/segmentation, IPAM/DHCP/DNS integration)
Cloud-native observability/monitoring (e.g. Prometheus, Grafana, OpenTelemetry)
Kubernetes and cloud-native (CNCF) ecosystem experience
Demonstrated ability to design and operate automation-first infrastructure at scale
Experience in other Infrastructure team domains - e.g. self-hosted Kubernetes deployment / management, and/or bare metal automation and fleet management

On-Call / Operations

Participate in an on-call rotation supporting critical production systems.

Location

Preference given to full-time onsite candidates in Pittsburgh, PA, followed by hybrid candidates.

Compensation and Benefits

Along with a competitive pay scale, full-time Teraswitch employees are eligible for the following benefits:

Health, Dental, and Vision Insurance
401(k) with company profit sharing
PTO and 11 Company Paid Holidays

Skills Required

Strong Linux systems and networking expertise, production operations experience
Depth in compute/virtualization such as KVM/QEMU, libvirt, or platforms like Proxmox/OpenStack
Depth in distributed storage services experience with platforms like Ceph, VAST, Weka
Experience in scripting (Python, bash) and/or configuration management (Ansible)
Experience with observability/monitoring systems
