Senior Site Reliability Engineer / Kubernetes (Remote)

10 Locations
In-Office or Remote
Senior level
Information Technology • Software
The Role
Operate and scale production Kubernetes clusters across bare-metal, virtualized and on-prem environments. Manage Linux infrastructure (Debian/Ubuntu), networking (VLANs, L2/L3, VPNs), automation (Ansible, Bash/Python, GitOps), observability (Prometheus/Grafana, ELK/Loki/Graylog), virtualization (OpenStack/Proxmox/VMware), bare-metal provisioning (MAAS/PXE), incident response, SLO/SLI definition, on-call rotations, SOPs, and cross-team architecture and maintenance coordination.

Job Description

Location: Fully remote, EU timezone (CET ±2h)
Start date: ASAP
Languages: Fluent English is mandatory
Industry: Cloud Computing

We are hiring at Pragmatike to expand our team and drive the growth of our internal projects.

Our focus is on developing cutting-edge solutions in Cloud Computing while fostering a culture of collaboration and innovation. Joining us means being part of a passionate team where your ideas and skills directly contribute to shaping tomorrow's technologies.

If you're excited about working on ambitious projects in a dynamic and flexible environment, we'd love to hear from you!

 
Responsibilities
  • Operate and maintain Linux-based infrastructure (Debian/Ubuntu).

  • Deploy, manage, and scale Kubernetes clusters across bare-metal, virtualized, and on-prem environments.

  • Oversee full cluster lifecycle: upgrades, node pools, networking, storage, and security hardening.

  • Implement automation for provisioning and operations using Ansible, Bash/Python, and GitOps workflows.

  • Design and maintain networking architecture including VLANs, L2/L3 routing, VPNs, and multi-site connectivity.

  • Build automated deployment workflows (PXE boot, Preseed, cloud-init).

  • Deploy and maintain observability stacks (Prometheus/Grafana, Loki, ELK, Graylog).

  • Lead incident response and escalation activities across the platform.

  • Improve system availability and reduce latency at all levels.

  • Define and implement SLOs/SLIs at multiple infrastructure levels (physical network/hardware, platform virtualization, software services).

  • Optimize alerting and monitoring pipelines to provide actionable insights.

  • Establish and maintain on-call schedules to ensure coverage across timezones.

  • Develop Standard Operating Procedures (SOPs) for repeatable operations and maintenance tasks.

  • Coordinate physical maintenance for Policlouds (periodic maintenance, hardware issues, DC-Ops).

  • Manage virtualization and orchestration layers (OpenStack, Proxmox, VMware).

  • Help develop and maintain overall architecture across all products.

  • Plan resources for future initiatives, accounting for demand and growth projections.

  • Work with development teams to improve overall quality and optimize resource utilization.

  • Collaborate with cross-functional stakeholders (Hivenet, Policloud, Customer Success teams).

Requirements
  • Expert-level, hands-on experience operating Kubernetes in production environments.

  • Strong network engineering skills (VLANs, L2/L3 routing, VPNs, multi-site connectivity); this is essential for the role.

  • Strong proficiency with Linux systems administration (Debian/Ubuntu).

  • Solid understanding of networking fundamentals and ability to design complex network architectures.

  • Experience building and maintaining automation workflows (Ansible, Bash/Python, Git-based).

  • Experience with observability stacks such as Prometheus, Grafana, ELK, Loki, or Graylog.

  • Background with virtualization technologies (OpenStack, Proxmox, VMware).

  • Experience with bare-metal provisioning and MAAS (Metal as a Service).

  • Strong understanding of distributed systems and container orchestration.

  • Process-oriented mindset with ability to develop SOPs and operational procedures from scratch.

  • Experience with incident response, escalation procedures, and on-call rotations.

  • Ability to work autonomously in a fast-paced, engineering-driven environment.

  • Strong technical skills combined with alignment with team values.

Nice To Have
  • Experience with service mesh (Istio, Linkerd) or advanced CNI implementations.

  • Knowledge of Cloudflare APIs, DNS automation, or tunnel configurations.

  • Experience with GPU infrastructure, node preparation, or resource scheduling.

  • Familiarity with security best practices (RBAC, firewalls, network policies).

  • Exposure to IT asset management or license tracking workflows.

  • Experience working in multi-timezone environments and coordinating across distributed teams.

  • Background establishing reliability practices and SRE frameworks in growing organizations.

Why Join Us:

  • 100% remote work with flexible hours

  • High-impact role with autonomy and ownership

  • Collaborative and international engineering team

  • Cutting-edge tech stack with a strong focus on reliability and automation


The Company
HQ: Marseille
11 Employees
Year Founded: 2022

What We Do

Trusted by remote-first companies worldwide. Completing tech projects for startups and scaleups.
