System Software Engineer

Reposted 3 Days Ago
San Jose, CA
In-Office
200K-275K Annually
Mid level
Artificial Intelligence • Hardware • Software
The Role
Develop and maintain critical system software components such as BIOS and BMC firmware, focusing on performance, security, and reliability of server infrastructures.

About Etched

Etched is building the world’s first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep and parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers from NVIDIA, Google, and Meta, Etched is redefining the infrastructure layer for the fastest-growing industry in history.

Job Summary
We are seeking a highly skilled and motivated System Software Engineer to join our team, responsible for the foundational software that powers our server infrastructure. This role focuses on the development, integration, and debugging of critical system software components, including BIOS, BMC firmware, boot processes (including NetBoot), root of trust implementations, advanced system logging, and kernel-mode drivers. You will play a pivotal role in ensuring the reliability, security, and performance of our server platforms, and contribute to the integration of data center orchestration technologies at the node level.

Key Responsibilities:

  • Firmware and Boot Process Development: Design, develop, and maintain BIOS and BMC firmware, ensuring robust and efficient server boot processes, including NetBoot implementations.

  • Measure and Tune System Performance Configuration: Analyze DRAM timings, PCIe configurations, power state transitions, etc., to ensure high performance and maximal reliability.

  • Root of Trust and Security: Implement and maintain security features, including root of trust mechanisms, to protect system integrity and data security.

  • Kernel-Mode Driver Development and Debugging: Develop and debug kernel-mode drivers, ensuring seamless hardware integration and optimal system performance.

  • Advanced System Logging and Diagnostics: Design and implement advanced system logging and diagnostic capabilities to facilitate efficient troubleshooting and performance analysis.

  • Data Center Orchestration Integration: Integrate and optimize node-level data center orchestration technologies, such as Kubernetes and Docker, into the system software stack.

  • System Validation and Testing: Develop and execute comprehensive test plans to validate system software functionality, stability, and performance.

  • Collaboration and Troubleshooting: Collaborate with hardware and software teams to diagnose and resolve complex system-level issues.

Representative Projects:

  • Implement and validate secure boot processes, including root of trust verification.

  • Develop and debug kernel-mode drivers for new hardware peripherals.

  • Design and implement advanced system logging and monitoring solutions.

  • Optimize BIOS and BMC firmware for improved boot times and system stability.

  • Integrate node-level container orchestration capabilities into the system software.

  • Analyze and resolve complex system-level issues related to boot failures, hardware errors, and performance degradation.

  • Analyze and optimize system-level logging for large-scale server deployments.

  • Implement and debug NetBoot processes for large server deployments.

Must-Have Skills and Experience:

  • Proficiency in C/C++.

  • Strong understanding of BIOS and BMC firmware architectures.

  • Experience with server boot processes (EFI, UEFI) and NetBoot technologies.

  • Knowledge of root-of-trust and security principles.

  • Experience with kernel-mode driver development and debugging.

  • Strong understanding of operating systems (Linux preferred) and server hardware architectures.

  • Experience with advanced system logging and diagnostic tools.

  • Ability to analyze complex technical problems and provide effective solutions.

  • Excellent communication and collaboration skills.

  • Experience with version control systems (e.g., Git).

  • Experience with reading and interpreting hardware logs.

Nice-to-Have Skills and Experience:

  • Experience with data center orchestration technologies (Kubernetes, Docker).

  • Experience with hardware diagnostic tools and techniques.

  • Knowledge of server virtualization.

  • Experience with tracing tools like perf, eBPF, ftrace, etc.

  • Experience with performance testing and benchmarking tools (gprof, VTune, Wireshark, etc.).

  • Experience with CI/CD pipelines.

  • Experience with Rust.

Benefits

  • Full medical, dental, and vision packages, with generous premium coverage

  • Housing subsidy of $2,000/month for those living within walking distance of the office

  • Daily lunch and dinner in our office

  • Relocation support for those moving to West San Jose

How we’re different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

Top Skills

BIOS
BMC Firmware
C/C++
Docker
EFI
Git
Kubernetes
Linux
UEFI

The Company
HQ: Cupertino, CA
53 Employees
Year Founded: 2022

What We Do

By burning the transformer architecture into our chips, we’re creating the world’s most powerful servers for transformer inference.
