Software Engineer, Site Reliability

Hiring Remotely in Turkey
Remote
Senior level
Cloud • Digital Media • Information Technology
Generative media platform for developers.
The Role
Manage the reliability and availability of customer-facing systems by operating Kubernetes infrastructure, CI/CD pipelines, monitoring systems, and driving reliability improvements through automation.

You are a seasoned SRE who keeps production infrastructure running at scale. You own the reliability and availability of customer-facing systems — from Kubernetes clusters to deployment pipelines to the networking layer that connects it all. You think in SLOs, automate ruthlessly, and treat every incident as a chance to make the system better.

Key Responsibilities
  • Own and operate our Kubernetes infrastructure: cluster lifecycle, upgrades, networking, and multi-tenant isolation for customer workloads
  • Build and maintain CI/CD pipelines and deployment infrastructure
  • Use AI extensively to automate the analysis and resolution of production issues and to improve software development speed, reliability, and maintainability
  • Build dashboards, alerting, and anomaly detection across our systems
  • Define and enforce SLOs and build out incident response processes
  • Manage and improve our networking, load balancing, and service mesh configurations
  • Drive reliability improvements across the stack through automation, runbooks, and chaos engineering
Requirements
  • 5+ years of experience managing critical production systems and software development workflows
  • Strong production experience setting up and operating Kubernetes at scale, using infrastructure-as-code (Terraform, Ansible)
  • Deep knowledge of Linux networking, container networking (CNI plugins, VXLAN, BGP), and DNS
  • Experience building CI/CD systems and GitOps workflows (FluxCD, ArgoCD)
  • Proficiency in Python and either Go or Bash for tooling and automation
  • Strong experience with logging, monitoring and alerting (Prometheus, Grafana, Loki, Thanos, VictoriaMetrics, Datadog)
  • Excellent communication and ability to drive technical decisions across teams
  • Self-starter who executes quickly, takes ownership, and constantly seeks improvement
Nice to have
  • Experience managing GPU and AI/ML workloads
  • Experience with kernel-based monitoring and routing (eBPF, XDP)
  • Experience with security tooling (Falco, Coroot, SIEM)
  • Experience with bare metal Kubernetes networking (Calico, Cilium, MetalLB)
  • Experience with distributed storage systems (Ceph, Longhorn, etc.)
Location
  • Turkey

What we offer at fal
  • Interesting and challenging work
  • A lot of learning and growth opportunities
  • Regular team events and offsites

The Company
73 Employees

What We Do

Generative Media Cloud