FuriosaAI Jobs

Software Engineer, Runtime

FuriosaAI

Sorry, this job was removed at 02:12 p.m. (UTC) on Monday, Jul 20, 2026

Seoul, KOR

In-Office

Artificial Intelligence • Information Technology • Software • Database • Manufacturing

The Role

About the Job

Designs and implements the low-level runtime stack that drives FuriosaAI's NPU hardware to its theoretical limits — from device driver interfaces and DMA-based I/O to kernel execution scheduling, multi-node inference, and embedded firmware.

Responsibilities

Develops the low-level runtime responsible for DMA-based I/O operations and kernel execution scheduling, maximizing inference throughput while minimizing end-to-end latency.
Builds and optimizes asynchronous execution pipelines that orchestrate data movement and compute across the NPU hardware.
Enables multi-node inference by implementing foundational communication primitives, including RDMA-based data transfer for low-latency, high-bandwidth inter-node operations.
Develops embedded firmware (PERT) that runs on the NPU's integrated ARM core, managing on-device scheduling, synchronization, and hardware resource control.
Profiles and tunes system-level performance across the full runtime stack — from firmware to user-space — to eliminate bottlenecks in real-world inference workloads.

Minimum Qualifications

Bachelor's degree in Computer Science, Electrical Engineering, or equivalent work experience.
Strong communication skills for cross-team requirement gathering and technical alignment.
3+ years of systems programming experience in Rust, C, or C++.
Solid understanding of computer architecture fundamentals: memory hierarchy, cache coherency, OS, DMA, interrupts, and MMIO.

Preferred Qualifications

Deep expertise in low-latency runtime systems, embedded firmware development, or high-performance I/O — especially in the context of accelerator hardware.
Experience designing and implementing low-latency asynchronous execution models and scheduling systems.
Experience with DMA engines, scatter-gather I/O, or other zero-copy data transfer mechanisms.
Experience developing embedded firmware for ARM-based processors (bare-metal or lightweight RTOS environments).
Familiarity with RDMA technologies and high-performance networking for distributed or multi-node systems.
Experience with CUDA low-level runtime internals such as CUDA Graphs, stream-based execution, and asynchronous kernel launch optimization.
Experience with kernel-level performance optimizations (e.g., Linux kernel modules, eBPF, perf, ftrace).
Understanding of deep learning inference workloads and their hardware execution characteristics.
Experience with profiling and performance tuning of system software on accelerator or SoC platforms.

The Company

HQ: Seoul, Seoul

143 Employees

Year Founded: 2017

What We Do

FuriosaAI designs and develops data center accelerators for the most advanced AI models and applications. Our mission is to make AI computing sustainable so everyone on Earth has access to powerful AI.

Our Background

Three misfit engineers, one each from the hardware, software, and algorithm fields, who had previously worked at AMD, Qualcomm, and Samsung, founded FuriosaAI in 2017 to build the world’s best AI chips. The company has raised more than $100 million, with investments from DSC Investment, Korea Development Bank, and Naver, the largest internet provider in Korea. We have partnered on our first two products with a wide range of industry leaders, including TSMC, ASUS, SK Hynix, GUC, and Samsung. FuriosaAI now has over 140 employees across Seoul, Silicon Valley, and Europe.

Our Approach

We are building full-stack solutions to offer the optimal combination of programmability, efficiency, and ease of use. We achieve this through a “first principles” approach to engineering: We start with the core problem, which is how to accelerate.