Software Engineer, Runtime

Seoul, KOR
In-Office
Mid level
Artificial Intelligence • Information Technology • Software • Database • Manufacturing
The Role
Design and implement low-level runtime systems for NPU hardware, focusing on DMA-based I/O operations, kernel scheduling, and optimizing performance for inference workloads.
About the Job

Designs and implements the low-level runtime stack that drives FuriosaAI's NPU hardware to its theoretical limits — from device driver interfaces and DMA-based I/O to kernel execution scheduling, multi-node inference, and embedded firmware.

Responsibilities
  • Develops the low-level runtime responsible for DMA-based I/O operations and kernel execution scheduling, maximizing inference throughput while minimizing end-to-end latency.

  • Builds and optimizes asynchronous execution pipelines that orchestrate data movement and compute across the NPU hardware.

  • Enables multi-node inference by implementing foundational communication primitives, including RDMA-based data transfer for low-latency, high-bandwidth inter-node operations.

  • Develops embedded firmware (PERT) that runs on the NPU's integrated ARM core, managing on-device scheduling, synchronization, and hardware resource control.

  • Profiles and tunes system-level performance across the full runtime stack — from firmware to user-space — to eliminate bottlenecks in real-world inference workloads.

Minimum Qualifications
  • Bachelor's degree in Computer Science, Electrical Engineering, or equivalent work experience.

  • Strong communication skills for cross-team requirement gathering and technical alignment.

  • 3+ years of systems programming experience in Rust, C, or C++.

  • Solid understanding of computer architecture fundamentals: memory hierarchy, cache coherency, OS, DMA, interrupts, and MMIO.

Preferred Qualifications
  • Deep expertise in low-latency runtime systems, embedded firmware development, or high-performance I/O — especially in the context of accelerator hardware.

  • Experience designing and implementing low-latency asynchronous execution models and scheduling systems.

  • Experience with DMA engines, scatter-gather I/O, or other zero-copy data transfer mechanisms.

  • Experience developing embedded firmware for ARM-based processors (bare-metal or lightweight RTOS environments).

  • Familiarity with RDMA technologies and high-performance networking for distributed or multi-node systems.

  • Experience with CUDA low-level runtime internals such as CUDA Graphs, stream-based execution, and asynchronous kernel launch optimization.

  • Experience with kernel-level performance optimizations (e.g., Linux kernel modules, eBPF, perf, ftrace).

  • Understanding of deep learning inference workloads and their hardware execution characteristics.

  • Experience with profiling and performance tuning of system software on accelerator or SoC platforms.

The Company
HQ: Seoul, Seoul
143 Employees
Year Founded: 2017

What We Do

FuriosaAI designs and develops data center accelerators for the most advanced AI models and applications. Our mission is to make AI computing sustainable so everyone on Earth has access to powerful AI.

Our Background

Three misfit engineers, one each from the hardware, software, and algorithm fields, who had previously worked at AMD, Qualcomm, and Samsung, came together and founded FuriosaAI in 2017 to build the world’s best AI chips. The company has raised more than $100 million, with investments from DSC Investment, Korea Development Bank, and Naver, the largest internet provider in Korea. We have partnered on our first two products with a wide range of industry leaders including TSMC, ASUS, SK Hynix, GUC, and Samsung. FuriosaAI now has over 140 employees across Seoul, Silicon Valley, and Europe.

Our Approach

We are building full-stack solutions to offer the optimal combination of programmability, efficiency, and ease of use. We achieve this through a “first principles” approach to engineering: We start with the core problem, which is how to accelerate.
