NextSilicon is reimagining high-performance computing (HPC) and AI. Our accelerated compute solutions use intelligent, adaptive algorithms to dramatically speed up supercomputers, driving them into a new generation. We have developed a novel software-defined hardware architecture that delivers significant advances in both the HPC and AI domains.
At NextSilicon, everything we do is guided by three core values:
- Professionalism: We strive for exceptional results through unwavering dedication to quality and performance.
- Unity: Collaboration is key to success. That's why we foster a work environment where every employee can feel valued and heard.
- Impact: We're passionate about developing technologies that make a meaningful impact on industries, communities, and individuals worldwide.
The AI Workloads team is responsible for modeling and enabling end-to-end AI workflows on NextSilicon’s next-generation hardware platforms. As an AI Workloads Engineer in Belgrade, you’ll build workflow modeling infrastructure, run and adapt open-source AI systems, and use real workloads to drive performance improvements from chip design through production.
- 4+ years of experience in software engineering.
- Strong Python and PyTorch development experience.
- Solid understanding of LLMs and modern inference workflows (e.g., KV cache, paged attention, speculative/assisted decoding, batching/scheduling).
- Experience running, profiling, and instrumenting open-source AI inference systems (e.g., vLLM or similar).
- Proficiency in C++ for developing software that models or interacts with hardware execution behavior (latency, dataflow, memory access patterns).
- Experience with distributed inference and collectives (e.g., NCCL) and parallelism strategies (TP/PP/EP) is an advantage.
- Experience with dynamic batching systems (e.g., vLLM, TensorRT-LLM) is an advantage.
- Familiarity with MLPerf Inference benchmarks and methodology (Server/Offline scenarios, latency constraints, request arrival patterns) is an advantage.
- Experience programming custom kernels (e.g., CUDA, Triton, or similar) is an advantage.
- Background in performance analysis, simulation, compiler/runtime profiling, or workload modeling is an advantage.
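For a flavor of the inference concepts above, here is a minimal, framework-free sketch of KV caching in an autoregressive decode loop. All names, shapes, and the identity "projection" are illustrative, not any specific library's API: the point is that each step appends one key/value pair and attends over the cache instead of recomputing the whole prefix.

```python
# Toy illustration of KV caching in autoregressive decoding.
# Everything here is simplified and illustrative, not a real framework API.
import math

def attend(q, keys, values):
    """Single-head scaled dot-product attention over cached keys/values."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
              for k in keys]
    m = max(scores)                        # subtract max for a stable softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

class KVCache:
    """Append-only cache: each decode step adds one (key, value) pair,
    so earlier tokens are never re-projected."""
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def __len__(self):
        return len(self.keys)

# Decode loop: project only the newest token, reuse everything cached.
cache = KVCache()
for token_vec in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    cache.append(token_vec, token_vec)     # toy "projection": identity
    out = attend(token_vec, cache.keys, cache.values)

print(len(cache))  # 3 cached entries after 3 steps
```

Paged attention and continuous batching build on this same idea, managing cache memory in fixed-size blocks and interleaving many such loops across requests.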
What We Do
We believe in a smarter future and want to create new opportunities for innovation. To achieve this, we're rethinking compute architectures for the future of processing.