AI Simulation Architect

Posted 11 Days Ago
Hiring Remotely in United States
Remote
7+ Years Experience
Hardware • Manufacturing
The Role
The AI Simulation Architect will lead the design and development of large-scale simulation environments for high-performance computing systems, focusing on performance analysis, system optimization, and collaboration with hardware and software teams. This role involves creating efficient simulation models to enhance computational performance for AI and data-intensive applications.
Summary Generated by Built In

Tenstorrent is leading the industry in cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

We are seeking a skilled and experienced Large-scale High-Performance Computing (HPC) and AI Simulation Architect to join our team. As an HPC Architect, you will lead the development of large-scale simulation environments for cutting-edge architectures in high-performance computing systems, enabling efficient and scalable computation for AI, scientific research, and data-intensive applications. You will work closely with cross-functional teams, including hardware engineers, software developers, and domain experts, to deliver optimized and efficient simulation environments that meet the demanding requirements of HPC workloads.

This role is remote, based in the United States.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.


Responsibilities:

  • Design simulation models/environments for large-scale AI/HPC systems consisting of tens of thousands of computational nodes, scale-out/scale-up switches/interconnects, and heterogeneous caching/memory systems.
  • Define simulation abstraction layers to manage different levels of simulation hierarchies, from abstract analytical roofline models to detailed cycle-accurate models, balancing simulation speed and accuracy.
  • Conduct performance analysis and benchmarking, writing performance models to identify bottlenecks, optimize system parameters, and guide architectural enhancements.
  • Simulate, design, and lead the development of high-performance computing architectures that deliver exceptional computational performance, scalability, and energy efficiency.
  • Collaborate with hardware engineers to design and optimize computational components, including processors, accelerators, interconnects, and memory subsystems.
  • Work closely with software developers to define and implement software development frameworks, libraries, and tools that maximize performance and productivity on the target HPC architecture.
  • Define and recommend system-level requirements, including processing power, memory capacity, I/O bandwidth, and storage capabilities, ensuring compliance with industry standards and customer expectations.
  • Evaluate and select appropriate technologies, including processors, accelerators, and network fabrics, based on application requirements, performance & power characteristics, and cost considerations.


Experience & Qualifications:

  • 15+ years of experience
  • Experience coding performance models in C++
  • Bachelor's or Master's degree in Computer Engineering, Electrical Engineering, or a related field. A Ph.D. is a plus.
  • Strong expertise in high-performance computing architecture design, including processors, accelerators, interconnects, and memory subsystems.
  • Experience developing new architectures using large-scale performance simulation environments, for example gem5 or SST.
  • Experience analyzing workload behavior on large systems using open-source or custom software tools
  • Proven experience in designing and optimizing HPC architectures for scientific, research, or data-intensive applications.
  • Proficiency in parallel programming models and frameworks, such as OpenMP, MPI, CUDA, or OpenCL, and their application to HPC workloads.
  • Solid understanding of performance analysis and optimization techniques for parallel computing, including profiling, tracing, and performance counters.
  • Familiarity with industry-standard interconnects and network fabrics, such as InfiniBand, Ethernet, or Omni-Path, and their impact on HPC system performance.
  • Knowledge of memory subsystems and memory hierarchy designs, including cache coherence protocols, memory models, and NUMA architectures.
  • Experience with HPC software stack components, such as compilers, runtime systems, job schedulers, and scientific libraries.
  • Strong programming skills in languages commonly used in HPC, such as C, C++, Fortran, or Python.
  • Excellent problem-solving abilities and the ability to analyze and address complex performance and scalability challenges.
  • Strong communication and collaboration skills to work effectively with cross-functional teams and domain experts.


Compensation for all engineers at Tenstorrent ranges from $100k - $500k including base and variable compensation targets. Experience, skills, education, background and location all impact the actual offer made.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

Due to U.S. Export Control laws and regulations, Tenstorrent is required to ensure compliance with licensing regulations when transferring technology to nationals of certain countries that are subject to licensing conditions set by the U.S. government.

Our engineering positions and certain engineering support positions require access to information, systems, or technologies that are subject to U.S. Export Control laws and regulations. Please note that citizenship/permanent residency, asylee, and refugee information and/or documentation will be required and considered as Tenstorrent moves through the employment process.

If a U.S. export license is required, employment will not begin until a license with acceptable conditions is granted by the U.S. government. If a U.S. export license with acceptable conditions is not granted by the U.S. government, then the offer of employment will be rescinded.

Top Skills

C++
The Company
HQ: Toronto, ON
389 Employees
On-site Workplace
Year Founded: 2016

What We Do

Tenstorrent is a next-generation computing company that builds computers for AI.

Headquartered in Toronto, Canada, with U.S. offices in Austin, Texas, and Silicon Valley, and global offices in Belgrade and Bangalore, Tenstorrent brings together experts in the field of computer architecture, ASIC design, advanced systems, and neural network compilers.

Join us: www.tenstorrent.com/careers
