AI Kernel Engineer

Posted 11 Days Ago
Pune, Mahārāshtra
In-Office
Senior level
Hardware • Machine Learning • Software
The Role
The AI Kernel Engineer will develop and optimize AI kernels for inference on Quadric's platform, analyze performance bottlenecks, and improve toolchains.
Summary Generated by Built In

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.
Role

The AI Kernel Engineer at Quadric plays a key role in enabling a large number of AI kernels/operators to run efficiently on the Quadric platform. The AI Kernel Engineer will [1] develop a highly efficient Quadric kernel library for a variety of AI/LLM models and [2] analyze performance and optimize kernels for different hardware configurations. This senior technical role demands deep knowledge of hardware architecture, compiler toolchains, and optimization techniques.

Responsibilities

  • Develop AI/LLM kernels/operators on the Quadric platform for efficient inference
  • Optimize kernel performance for different hardware configurations and workloads
  • Profile and analyze kernel performance in terms of compute, data movement, and parallelism; identify micro-architecture and software bottlenecks and propose optimizations
  • Optimize kernel C/C++ code to maximize hardware utilization
  • Collaborate across related areas of the AI inference stack to support team and business priorities
  • Improve the Quadric toolchain, compiler, and runtime
  • Provide technical support and documentation to customers and the developer community

Requirements
  • Bachelor’s or Master’s in Computer Science and/or Electrical Engineering
  • 5+ years of experience in AI kernel development and optimization
  • Experience with model and kernel inference performance profiling
  • Experience with at least one of the following compute development platforms: CUDA, DSP, NEON, Triton-lang
  • Proficiency in C/C++ and Python; experience with assembly language is a plus
  • Strong problem-solving, debugging, and communication skills

Benefits
  • Provide competitive salaries and meaningful equity
  • Provide a politics-free community for brilliant minds who want to make an immediate impact
  • Provide an opportunity for you to build long-term career relationships
  • Foster an environment that allows for lasting personal relationships alongside professional ones

Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Quadric aims to empower developers in every industry with superpowers to create tomorrow’s technology, today. The company was co-founded by technologists from MIT and Carnegie Mellon, who were previously the technical co-founders of the Bitcoin computing company 21.

Quadric is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, religion, sex, national origin, sexual orientation, age, citizenship, marital status, or disability.

Top Skills

C++
CUDA
DSP
NEON
Python
Triton-Lang

The Company
HQ: Burlingame, CA
38 Employees
Year Founded: 2017

What We Do

Quadric has built a unified hardware/software architecture optimized for on-device machine learning inference. Only the Quadric GPNPU (general-purpose neural processing unit) delivers high ML inference performance while also running C++ code without forcing the developer to artificially partition application code between two or three different kinds of processors. Quadric's GPNPU is a licensable processor IP core that scales from 1 to 64 TOPS and seamlessly intermixes scalar, vector, and matrix code.
