About Celestial AI
As Generative AI continues to advance, the performance drivers for data center infrastructure are shifting from systems-on-chip (SoCs) to systems of chips. In the era of Accelerated Computing, data center bottlenecks are no longer limited to compute performance; they now include the system’s interconnect bandwidth, memory bandwidth, and memory capacity. Celestial AI’s Photonic Fabric™ is the next-generation interconnect technology that delivers a tenfold increase in performance and energy efficiency compared to competing solutions.
The Photonic Fabric™ is available to our customers in multiple technology offerings, including optical interface chiplets, optical interposers, and Optical Multi-chip Interconnect Bridges (OMIB). This allows customers to easily incorporate high bandwidth, low power, and low latency optical interfaces into their AI accelerators and GPUs. The technology is fully compatible with both protocol and physical layers, including standard 2.5D packaging processes. This seamless integration enables XPUs to utilize optical interconnects for both compute-to-compute and compute-to-memory fabrics, achieving bandwidths in the tens of terabits per second with nanosecond latencies.
This innovation empowers hyperscalers to enhance the efficiency and cost-effectiveness of AI processing by optimizing the XPUs required for training and inference, while significantly reducing the TCO2 impact. To bolster customer collaborations, Celestial AI is developing a Photonic Fabric ecosystem consisting of tier-1 partnerships that include custom silicon/ASIC design, system integrators, HBM memory, assembly, and packaging suppliers.
DESCRIPTION
As a Compiler Backend Engineer, you will be a key player in expanding the Backend functionality and optimizations for the Celestial AI Machine Learning accelerator architecture. We have opportunities in the areas of code generation, partitioning, kernel generation, and runtime interfaces. This role is highly collaborative, working with Architecture, Hardware, and ML Operations and Developer teams to ensure that the Compiler requirements and tools needed to achieve exceptional performance are met. The Hardware and Instruction Set Architecture (ISA) are driven from a Compiler perspective, with the goal of rapidly delivering full functionality and performance across a broad spectrum of ML models with minimal Compiler complexity. The compiler design flow is iterative and fast-paced.
ESSENTIAL DUTIES AND RESPONSIBILITIES
- Design, prototype, and expand the functionality and performance of the compiler in the areas of code generation, partitioning, kernel generation, and/or runtime interfaces
- Benchmark, test, and analyze output produced by the compiler
- Work with HW teams and architects to drive improvements in architecture and SW compiler
QUALIFICATIONS
- BS or MS in Computer Science, Engineering, or related field
- 2+ years of experience in compilers for data parallel architectures
- Experience in the field of compilers for Machine Learning inference models, preferred
- Experience with the Apache TVM Hardware Backend codebase and workflow for custom code generators (BYOC), preferred
- Effective communicator and collaborator
We offer great benefits (health, vision, dental, and life insurance) and a collaborative, continuous-learning work environment, where you will have the chance to work with smart, dedicated people engaged in developing the next-generation architecture for high-performance computing.
Celestial AI Inc. is proud to be an equal opportunity workplace and is an affirmative action employer.
What We Do
Celestial AI is a Machine Learning (ML) accelerator company that has developed a proprietary technology platform enabling the next generation of high-performance computing solutions. Celestial AI’s mission is to transform data parallel computing with its proprietary Photonic Fabric™ technology platform, which uses light for data movement both within and between chips.
Advancements in data communications have driven robust silicon photonics technology and volume manufacturing ecosystems that are ripe for commercial implementation of ML and high-performance computing (HPC) solutions that leverage integrated silicon photonics for data movement.
Celestial AI’s system delivers differentiated single-node performance that scales efficiently, providing significant performance gains for multi-node and multi-model applications. The scalability of Celestial AI’s accelerator architecture enables an efficient and performant mapping of data and compute across a broad range of ML model types without the need for complex software optimizations. Celestial AI’s competitive advantage will only grow over time as ML models continue to increase in complexity and size.
Celestial AI has assembled a highly experienced team of industry leaders with a track record of building multiple successful technology businesses. The company’s Orion AI accelerator products serve an addressable market that Omdia projects will exceed $70 billion in 2025.