Lead Architect, Runtime

Job Posted 17 Days Ago Reposted 17 Days Ago
Palo Alto, CA
Senior level
Artificial Intelligence • Hardware • Machine Learning • Natural Language Processing • Software • Semiconductor • Generative AI
SambaNova is the #1 platform for business AI.
The Role
The Lead Architect, Runtime will design and optimize a high-performance, distributed software runtime for advanced AI workloads. Responsibilities include architecting the runtime stack, overseeing hardware-software integration, driving technical strategy, and mentoring team members to ensure efficient and scalable software solutions.
Summary Generated by Built In

The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, drive efficiency and innovation to fundamentally transform their businesses and operations at scale.

SambaNova Suite™ is the first full-stack, generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once a model has been adapted with their data, customers retain ownership of it in perpetuity, so they can turn generative AI into one of their most valuable assets.

We’re seeking a Lead Architect, Runtime to join our talented Runtime team—a group of engineers who have a proven track record of building software that directly powers advanced AI workloads and scientific computing. As a key technical leader, you will be responsible for designing and architecting a high-performance, distributed, and scalable software runtime that supports our broad array of data-flow applications, including machine learning training and inference, data processing pipelines (ETL), and HPC applications.

In this role, you will have the opportunity to define and deliver the architecture of our entire runtime stack, driving everything from OS-level integration to performance profiling, networking, and optimization, while working closely with hardware teams to design the most efficient systems.

Key Responsibilities:

  • Architectural Leadership: Lead the design, development, and performance optimization of the software runtime stack, ensuring it meets the high-performance and scalability requirements of ML, AI, and HPC applications.
  • Runtime System Design: Architect embedded software infrastructure to enable smooth integration of high-level applications with the underlying hardware, including OS interface/integration, partitioned workload orchestration, fault management, and inter-node communication.
  • Hardware Interaction: Oversee and guide the low-level integration between software and hardware components, ensuring efficient chipset initialization, monitoring, and fault management.
  • Technical Strategy: Drive the technical direction for the Runtime Engineering team, ensuring the design and implementation of software that delivers performance and scales efficiently with our next-generation AI hardware and platforms based on our Reconfigurable Dataflow Architecture.
  • Tooling and Profiling: Lead the design and development of tools and performance profilers, empowering customers to configure, deploy, and optimize their workloads on SambaNova’s DataScale systems.
  • Mentorship and Team Development: Inspire and guide the team to continuously improve development processes, coding standards, and collaboration practices. Foster a culture of excellence, accountability, and technical growth.
  • Cross-functional Leadership: Collaborate with hardware, software, and product teams to define requirements and ensure seamless integration between hardware and system software components.

Skills and Qualifications:

  • Strong Software Engineering Background: Proven experience building, testing, and tuning software for distributed, high-performance systems. In-depth knowledge of operating systems and runtime stacks.
  • Real-Time Operating Systems (RTOS): Hands-on experience with RTOS and system-level software that directly interfaces with hardware.
  • High-Performance Computing (HPC): Expertise in designing and optimizing systems that handle massively parallel workloads, including machine learning training and inference tasks that involve billions of operations per second.
  • Low-Level System Understanding: Deep understanding of hardware-software interaction, including registers, device memory management, and the intricacies of accelerator design. Experience working with ASIC accelerators is highly desirable.
  • Distributed Systems Expertise: Familiarity with distributed systems architecture, including networking, communication protocols, and the challenges of scaling compute resources efficiently.
  • Toolchain Expertise: Hands-on experience with software development tools such as Git, Jenkins, and Jira, with an ability to drive automation and continuous integration efforts.
  • Cross-Disciplinary Knowledge: Ability to work at the intersection of hardware and software, designing systems that optimize both performance and reliability.

Preferred Experience:

  • ASIC/FPGA Expertise: Experience designing or working closely with custom hardware accelerators (ASICs, FPGAs, etc.) and understanding low-level interactions.
  • Cloud and Data Center Experience: Familiarity with deploying high-performance systems in distributed, cloud, or data center environments.

What We Offer:

  • Opportunity to work on cutting-edge technologies that power the next generation of AI and ML applications.
  • A collaborative, dynamic environment where your ideas and leadership will have a direct impact on the success of the company.
  • A chance to work with some of the brightest minds in the industry and contribute to groundbreaking innovations in AI, HPC, and distributed computing.

If you're passionate about designing high-performance, distributed systems and want to lead the architectural evolution of AI infrastructure, we want to hear from you.

Annual Salary Range and Level

The base salary for this position ranges from $200,000/year up to $250,000/year. This range is based on role, level, and location and reflects the salary target for new hires in the US. Individual pay within the range will depend on a number of factors, including a candidate’s job-related qualifications, skills, competencies and experience, and location.

Submission Guidelines
Please note that in order to be considered an applicant for any position at SambaNova Systems, you must submit an application form for each position for which you believe you are qualified. 

EEO Policy
SambaNova Systems is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to age (40 and over), color, disability, gender identity, genetic information, marital status, military or veteran status, national origin/ancestry, race, religion, creed, sex (including pregnancy, childbirth, breastfeeding), sexual orientation, or any other applicable status protected by federal, state, or local laws.

Benefits Summary for US-Based, Full-Time Employment Positions
SambaNova offers a competitive total rewards package, including the base salary, plus equity and benefits. We cover 95% of the premium for employee medical insurance and 77% of the premium for dependents, and we offer a Health Savings Account (HSA) with employer contribution. We also offer Dental, Vision, Short/Long term Disability, Basic Life, Voluntary Life, and AD&D insurance plans in addition to Flexible Spending Account (FSA) options like Health Care, Limited Purpose, and Dependent Care. Our library of well-being benefits available to you and your dependents includes a full subscription to Headspace, Gympass+ membership with access to physical gyms, One Medical membership, counseling services with an Employee Assistance Program, and much more.

The Company
HQ: Palo Alto, CA
500 Employees
Hybrid Workplace
Year Founded: 2017

What We Do

AI is changing the world, and at SambaNova we believe that you don’t need unlimited resources to take advantage of the most advanced, valuable AI capabilities: capabilities that are helping organizations explore the universe and find cures for cancer, and giving companies access to insights that provide a competitive edge.

We deliver the world’s fastest and only complete AI solution for enterprises and governments with world-record inference performance and accuracy. Powered by the SambaNova SN40L Reconfigurable Dataflow Unit (RDU), organizations can build a technology backbone for the next decade of AI innovation with SambaNova Suite. Our fully integrated hardware-software system, DataScale®, enables organizations to train, fine-tune, and deploy the most demanding AI workloads using the largest and most challenging models. Most recently, with the launch of our newest offering, SambaNova Cloud, developers can supercharge AI-powered applications on Llama 3.2 models.

SambaNova was founded in 2017 in Palo Alto, California, by a group of industry luminaries, business leaders, and world-class innovators who understand AI. Today, we’ve built an incredibly smart and motivated team dedicated to making a lasting impact on the industry and equipping our customers to thrive in the new era of AI.

Why Work With Us

As a talent-first company, we aim to hire the greatest and most innovative minds in the industry, driving the next generation of AI computing where no barrier is too high and the possibilities are truly limitless. We encourage our peers to take risks and take the initiative to make a lasting impact on the AI and ML industries.
