Software Engineer, Site Reliability Engineer

Posted 3 Days Ago
Seoul, KOR
Hybrid
Mid level
Artificial Intelligence • Information Technology • Software • Database • Manufacturing
The Role
The Site Reliability Engineer will enhance the reliability and operability of production infrastructure through automation, observability, and architectural improvements.
About the Role

As a Site Reliability Engineer, you will apply software engineering to improve the reliability, scalability, security, and operability of FuriosaAI’s production infrastructure and customer-facing services. You will work across bare-metal Kubernetes clusters, cloud control planes, networking, observability systems, deployment pipelines, and API services running on Furiosa NPUs.

We are looking for an engineer who can reason about production systems end-to-end, identify reliability risks across service and infrastructure boundaries, build the observability foundation required to understand them, and drive improvements through code, configuration, automation, and architectural changes.

In this role, your mission is defined by three primary pillars:

  • Reliability Architecture: Improve production systems so that failures are isolated and detected quickly, services degrade gracefully, and recovery is safe.

  • Observability & SLOs: Build the metrics, logs, traces, dashboards, alerts, and service-level indicators required to understand user-facing reliability.

  • Production Engineering: Reduce operational toil through automation, self-service workflows, safer rollouts, and hands-on engineering contributions.

Responsibilities
  • Define and evolve reliability goals for production systems through SLIs, SLOs, error budgets, and meaningful operational metrics.

  • Design and build observability foundations that make system behavior, user impact, performance bottlenecks, and failure modes measurable and actionable.

  • Analyze production systems end-to-end, identify reliability risks across software, infrastructure, and networking boundaries, and drive architectural improvements.

  • Improve change safety and failure recovery through better rollout strategies, capacity planning, load validation, graceful degradation, and incident learning loops.

  • Reduce operational toil by building automation, internal tooling, and self-service workflows that make production systems easier to operate and harder to misuse.

Minimum Qualifications
  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

  • Strong programming skills in one or more general-purpose languages such as Rust, Python, or Go.

  • Solid understanding of operating systems, computer networks, and cloud-native or container-based environments.

  • Ability to analyze technical problems and communicate clearly with engineering teams.

Preferred Qualifications
  • Experience improving reliability of production systems using SLOs, observability, incident analysis, rollout safety, and error-budget-driven decision making.

  • Experience designing or operating distributed systems where failures, overload, latency, and capacity limits must be explicitly managed.

  • Experience building automation, internal tooling, or self-service workflows that reduce operational toil and improve engineering productivity.

  • Experience working across software, infrastructure, networking, and security boundaries to diagnose problems and drive architectural improvements.

Skills Required

  • Bachelor's degree in Computer Science, Engineering, or related field
  • Strong programming skills in Rust, Python, or Go
  • Understanding of operating systems and computer networks
  • Ability to analyze technical problems and communicate clearly

The Company
HQ: Seoul, Seoul
143 Employees
Year Founded: 2017

What We Do

FuriosaAI designs and develops data center accelerators for the most advanced AI models and applications. Our mission is to make AI computing sustainable so everyone on Earth has access to powerful AI.

Our Background

Three misfit engineers, one each from the hardware, software, and algorithm fields, who had previously worked for AMD, Qualcomm, and Samsung, came together and founded FuriosaAI in 2017 to build the world’s best AI chips. The company has raised more than $100 million, with investments from DSC Investment, Korea Development Bank, and Naver, the largest internet provider in Korea. We have partnered on our first two products with a wide range of industry leaders including TSMC, ASUS, SK Hynix, GUC, and Samsung. FuriosaAI now has over 140 employees across Seoul, Silicon Valley, and Europe.

Our Approach

We are building full-stack solutions that offer the best combination of programmability, efficiency, and ease of use. We achieve this through a “first principles” approach to engineering: We start with the core problem, which is how to accelerate.
