Failure Analysis Engineer, Electrical Engineering

Posted 9 Days Ago
Hiring Remotely in Taipei, TWN
Remote
Mid level
Artificial Intelligence • Hardware • Software
The Role
The Failure Analysis Engineer will diagnose hardware failures across chip, board, and rack systems, refine debugging processes, and collaborate cross-functionally to enhance production efficiency.
Summary Generated by Built In

About Etched

Etched is building the world’s first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep and parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest-growing industry in history.

Job Summary

Etched is hiring a Failure Analysis Engineer to own the end-to-end debug process across our full hardware stack: chip, board, and rack-scale systems.

You will be responsible for rapidly diagnosing, triaging, and resolving hardware failures; determining whether issues originate in the chip, board, or rack infrastructure; and driving resolution with the appropriate team. This is a highly cross-functional role, working closely with US-based hardware and silicon teams to build and refine debug playbooks as production scales.

The ideal candidate has deep EE fundamentals, systems-level debugging experience, and the ability to solve hard problems under pressure.

Key Responsibilities

  • Own failure triage across the stack. Receive field and production failures, isolate whether the root cause is chip, board-level, or system/rack-level, and route to the appropriate team with a clear problem statement.

  • Drive root cause analysis using electrical test equipment (oscilloscopes, logic analyzers, multimeters) and system-level diagnostics to identify failure mechanisms and determine corrective actions.

  • Build and refine debug processes. Partner with US hardware counterparts to document debug flows for different failure modes, creating repeatable playbooks that scale with production volume.

  • Debug rack-level issues. Troubleshoot communication failures between rack managers, CDUs, and system components. Understand how thermal, power, and network infrastructure interact at rack scale.

  • Interface with BMC and system firmware. Use Linux command line and BMC interfaces to pull logs, run diagnostics, and validate system health during failure investigations.

  • Close the loop on quality. Feed failure trends and root cause findings back to design, manufacturing, and operations teams to drive systemic improvements.

You May Be a Good Fit If You Have

  • Bachelor’s or Master’s degree in Electrical Engineering or a related field.

  • Fluency in oscilloscopes, signal integrity basics, power delivery, and board-level debug.

  • Systems-level thinking. Strong understanding of how servers work end-to-end: BMC, BIOS, OS, thermals, and power sequencing. Can debug issues that span multiple subsystems.

  • Linux command line proficiency. Comfortable pulling logs, running scripts, and navigating server environments from the terminal.

  • Strong communication skills across teams. You can translate a complex hardware failure into a clear problem statement for silicon, firmware, or mechanical teams. You've worked across time zones and functions.

  • Composure under pressure. Production failures don't wait. You're energized by urgent, ambiguous problems and take ownership until they're resolved.

  • 3+ years of experience in hardware debug, failure analysis, or systems engineering in a server, datacenter, or semiconductor environment.

Nice to Haves

  • Rack-scale infrastructure (cooling systems, power distribution, rack managers)

  • High-speed interfaces (PCIe, Ethernet, SerDes) and their common failure modes

  • ATE or production test environments

  • Experience with datacenters, GPUs, FPGAs, or custom ASICs

We encourage you to apply even if you do not believe you meet every single qualification.

Benefits

  • Competitive compensation packages, including generous equity packages

  • Comprehensive insurance coverage and other top-of-market benefits

  • US onboarding experience in our Silicon Valley headquarters

How We're Different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in San Jose and Taipei, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Skills Required

  • Bachelor's or Master's degree in Electrical Engineering or related field
  • 3+ years of experience in hardware debug, failure analysis, or systems engineering
  • Fluency in oscilloscopes, signal integrity basics, and board-level debug
  • Linux command line proficiency

The Company
HQ: Cupertino, CA
53 Employees
Year Founded: 2022

What We Do

By burning the transformer architecture into our chips, we’re creating the world’s most powerful servers for transformer inference.
