Sr Software Engineer

Reposted 23 Days Ago
Hiring Remotely in Liberty Lake, WA
In-Office or Remote
Senior level
Hardware • Information Technology • Software
The Role
The Senior Software Engineer will develop high-performance software for executing open-source LLMs on custom hardware, focusing on optimizations and efficient libraries primarily in C++.
Summary Generated by Built In

About Us:

Positron.ai develops custom hardware systems that accelerate AI inference. These systems offer significant gains over traditional GPU-based systems in both performance per dollar and performance per watt. Positron exists to create the world's best AI inference systems.



Senior Software Engineer – Machine Learning Systems & High-Performance LLM Inference


We are seeking a Senior Software Engineer to contribute to the development of high-performance software that powers the execution of open-source large language models (LLMs) on our custom appliance. This appliance combines FPGAs and x86 CPUs to accelerate transformer-based models. The software stack is written primarily in modern C++ (C++17/20) and relies heavily on templates, SIMD optimizations, and efficient parallel computing techniques.

Key Areas of Focus & Responsibilities
  • Design and implement high-performance inference software for LLMs on custom hardware.
  • Develop and optimize C++-based libraries that efficiently utilize SIMD instructions, threading, and memory hierarchy.
  • Work closely with FPGA and systems engineers to ensure efficient data movement and computational offloading between x86 CPUs and FPGAs.
  • Optimize model execution via low-level optimizations, including vectorization, cache efficiency, and hardware-aware scheduling.
  • Contribute to performance profiling tools and methodologies to analyze execution bottlenecks at the instruction and data flow levels.
  • Apply NUMA-aware memory management techniques to optimize memory access patterns for large-scale inference workloads.
  • Implement ML system-level optimizations such as token streaming, KV cache optimizations, and efficient batching for transformer execution.
  • Collaborate with ML researchers and software engineers to integrate model quantization techniques, sparsity optimizations, and mixed-precision execution.
  • Ensure all code contributions include unit, performance, acceptance, and regression tests as part of a continuous-integration-based development process.
Required Skills & Experience
  • 7+ years of professional experience in C++ software development, with a focus on performance-critical applications.
  • Strong understanding of C++ templates and modern memory management.
  • Hands-on experience with SIMD programming (AVX-512, SSE, or equivalent) and intrinsics-based vectorization.
  • Experience in high-performance computing (HPC), numerical computing, or ML inference optimization.
  • Experience with ML model execution optimizations, including efficient tensor computations and memory access patterns.
  • Knowledge of multi-threading, NUMA architectures, and low-level CPU optimization.
  • Proficiency with systems-level software development, profiling tools (Perfetto, VTune, Valgrind), and benchmarking.
  • Experience working with hardware accelerators (FPGAs, GPUs, or custom ASICs) and designing efficient software-hardware interfaces.
Preferred Skills (Nice to Have)
  • Familiarity with LLVM/Clang or GCC compiler optimizations.
  • Experience in LLM quantization, sparsity optimizations, and mixed-precision computation.
  • Knowledge of distributed inference techniques and networking optimizations.
  • Understanding of graph partitioning and execution scheduling for large-scale ML models.
Why Join Us?
  • Work on a cutting-edge ML inference platform that redefines performance and efficiency for LLMs.
  • Tackle challenging low-level performance engineering problems in AI and HPC.
  • Collaborate with a team of hardware, software, and ML experts building an industry-first product.
  • Opportunity to contribute to and shape the future of open-source AI inference software.


Top Skills

AVX-512
C++
Clang
FPGA
GCC
HPC
LLVM
ML
SIMD
SSE

The Company
HQ: Reno, NV
42 Employees
Year Founded: 2023

What We Do

Positron delivers vendor freedom and faster inference for both enterprises and research teams by allowing them to use hardware and software explicitly designed from the ground up for generative and large language models (LLMs).

Through lower power usage and drastically lower total cost of ownership (TCO), Positron enables you to run popular open-source LLMs to serve multiple users at high token rates and long context lengths. Positron is also designing its own ASIC to expand from inference and fine-tuning to also support training and other parallel compute workloads.
