Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.
As a performance architect in the dynamic and motivated Tenstorrent Platform Architecture team, you will work in a cross-functional team on ML software stacks, HPC and general purpose workloads, graph compiler, cache coherency protocols, superscalar CPU, fabric/interconnection, networking, and DPU.
This role is remote, based in the United States.
We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.
Responsibilities:
- Collaborate with the software team and platform architecture team to understand fabric bandwidth and latency requirements and real-time constraints for AI accelerator, CPU, security, and networking traffic. Devise QoS and ordering rules among CPU, accelerator, and IO coherent/non-coherent traffic.
- Identify representative traffic patterns for the software applications. Perform data-driven analysis to evaluate fabric topology, QoS, memory architecture, and microarchitecture solutions that improve performance or power efficiency, or reduce hardware.
- Create a directory-based cache coherency specification that satisfies the performance requirements of a coherent multi-cluster CPU system and accelerator. Trade off protocol complexity against performance requirements.
- Design the cache hierarchy to achieve the best performance.
- Set SoC architecture direction based on data analysis, and work with a cross-functional team to achieve the best hardware/software solutions that meet PPA goals.
- Develop a cycle-accurate SoC performance model that describes the microarchitecture, including memory subsystems, directory-based coherent cache controllers, fabric interconnects, and fabric switches, and use it to evaluate new features.
- Collaborate with RTL and physical design engineers to make power, performance, and area trade-offs.
- Drive analysis and correlation of performance features both pre- and post-silicon.
Experience & Qualifications:
- BS/MS/PhD in EE/ECE/CE/CS
- Strong grasp of NoC topologies, routing algorithms, queuing, traffic scheduling, and QoS requirements.
- Expertise in cache coherency protocols (AMBA CHI/AXI protocol), DDR/LPDDR/GDDR memory technology, and IO technology (PCIe/CCIX/CXL).
- Prior experience or strong understanding of traffic patterns for ML/AI algorithms in a heterogeneous computation system is a plus.
- Prior experience with formal verification of cache coherence protocols is a plus.
- Proficient in C/C++ programming. Experience in the development of highly efficient C/C++ CPU models.
Compensation for all engineers at Tenstorrent ranges from $100k - $500k including base and variable compensation targets. Experience, skills, education, background and location all impact the actual offer made.
Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.
Due to U.S. Export Control laws and regulations, Tenstorrent is required to ensure compliance with licensing regulations when transferring technology to nationals of certain countries that have had licensing conditions set by the U.S. government.
Our engineering positions and certain engineering support positions require access to information, systems, or technologies that are subject to U.S. Export Control laws and regulations. Please note that citizenship/permanent residency, asylee, and refugee information and/or documentation will be required and considered as Tenstorrent moves through the employment process.
If a U.S. export license is required, employment will not begin until a license with acceptable conditions is granted by the U.S. government. If a U.S. export license with acceptable conditions is not granted by the U.S. government, then the offer of employment will be rescinded.
What We Do
Tenstorrent is a next-generation computing company that builds computers for AI.
Headquartered in Toronto, Canada, with U.S. offices in Austin, Texas, and Silicon Valley, and global offices in Belgrade and Bangalore, Tenstorrent brings together experts in the fields of computer architecture, ASIC design, advanced systems, and neural network compilers.
Join us: www.tenstorrent.com/careers