About Voltai
Voltai is developing world models and agents that learn, evaluate, plan, experiment, and interact with the physical world. We are starting with understanding and building hardware: electronic systems and semiconductors, where AI can design and create beyond human cognitive limits.
About the Team
We are backed by Silicon Valley’s top investors, Stanford University, and CEOs and presidents of companies including Google, AMD, Broadcom, and Marvell. Our team includes former Stanford professors, SAIL researchers, Olympiad medalists (IPhO, IOI, etc.), CTOs of Synopsys and GlobalFoundries, Head of Sales and CRO of Cadence, former US Secretary of Defense, National Security Advisor, and Senior Foreign-Policy Advisor to four US presidents.
About the Role
You will develop, integrate, and optimize state-of-the-art CUDA kernels to power AI models that accelerate semiconductor design and verification. Your work will enable large-scale model training, inference, and reinforcement learning systems that reason about circuit layouts, generate and validate RTL, and optimize chip architectures — running efficiently across thousands of GPUs.
You’ll build tools, performance benchmarks, and integration layers that push the limits of GPU utilization for compute-intensive workloads in AI-driven hardware design. Working closely with researchers and engineers, you’ll help make Voltai the world’s leading AI + semiconductor research organization. You’ll also release your kernels and tooling as contributions to the open-source AI and HPC ecosystems.
You might thrive in this role if you have experience with:
Writing and optimizing CUDA kernels for large-scale AI workloads (attention, routing, graph-based operations, physics-inspired operators, etc.)
Profiling and optimizing GPU performance for custom compute-bound or memory-bound workloads
Integrating custom kernels into cutting-edge training and inference frameworks (e.g., PyTorch, Megatron, vLLM, TorchTitan)
Working with the latest NVIDIA hardware and software stacks (Hopper, Blackwell, NVLink, NCCL, Triton)
Building GPU-accelerated primitives for graph reasoning, symbolic computation, or hardware simulation tasks
Collaborating with AI researchers and semiconductor experts to translate domain-specific workloads into high-performance GPU code
What We Do
AI models for electronics








