Site Reliability Engineer

Posted 2 Days Ago
Sunnyvale, CA
In-Office
145K-165K Annually
Senior level
Semiconductor • Manufacturing
Bring your biggest ideas to life with the world's fastest Graphics Processors.
The Role
The Site Reliability Engineer will design, implement, and manage reliable infrastructure and services, ensuring operational excellence and uptime.
Summary Generated by Built In

Bolt Graphics is a semiconductor startup based in Sunnyvale, CA, building the fastest and most efficient graphics processors. We pride ourselves on our first-principles approach to solving problems. We are energized by our mission to lower the barrier to entry for content creation and consumption. Our goal is to enable everyone to easily create, simulate, and consume immersive experiences as vividly as they can imagine them.

Our Values

  • Be Fearless: Unmute yourself. Test boundaries and get proven right.
  • Remain Adaptable: Stay comfortable in a continuously changing world. If you’re wrong, concede and move on.
  • Educate Your Ego: Selflessly collaborate towards our shared purpose.


About the role:

Bolt Graphics is seeking a highly experienced Site Reliability Engineer (SRE) to design, build, and operate highly reliable developer and production systems. This role is mission-critical to maintaining uptime, performance, and operational excellence across compute, storage, and networking environments. Exceptional Linux expertise and advanced automation capabilities are mandatory for success in this role.

What you'll do:

  • Design, implement, and operate highly available, fault-tolerant infrastructure and services.
  • Install, maintain, and upgrade server, storage, and networking hardware in office and colocation facilities.
  • Continuously monitor developer and production environments and proactively remediate reliability risks.
  • Participate in an on-call rotation and lead incident response efforts, including rapid triage, mitigation, and post-incident root cause analysis.
  • Respond effectively under pressure to outages and degradation events to restore service availability.
  • Develop, maintain, and continuously improve automation and operational tooling using Bash and Python.
  • Partner closely with engineering teams to support development, testing, and production workloads at scale.

Qualifications (required):

  • Expert-level Linux systems administration across complex production environments (this is a core requirement).
  • Exceptional proficiency in Bash and Python; advanced scripting and automation skills are mandatory, not optional.
  • Proven ability to write maintainable automation and diagnostic tooling for large-scale systems.
  • Deep understanding of server hardware, storage subsystems, and datacenter operations.
  • Hands-on experience with virtualization platforms including Proxmox (current), VMware vSphere, and/or OpenShift.
  • Strong experience with containerization technologies (Docker, containerd) and orchestration platforms (Kubernetes).
  • Experience operating workloads in AWS and/or Microsoft Azure environments.
  • Experience implementing observability, monitoring, and alerting using tools such as Prometheus and Grafana.

Additional Qualifications:

  • Familiarity with systems programming languages such as C, C++, Rust, Go, and/or Julia.
  • Relevant certifications such as CompTIA A+, Azure Engineer, or similar are preferred.
  • Active government clearance or the ability to obtain one is preferred.

On-Call & Incident Response Expectations:

This role includes participation in an on-call rotation supporting developer and production systems. The SRE is expected to respond to incidents outside of normal business hours as required, lead technical incident response efforts, communicate effectively with stakeholders during outages, and produce clear post-incident documentation and corrective action plans.


Compensation Range: $145,000–$165,000 per year (California). This range represents the anticipated base pay for this role; the final offer may vary based on qualifications, experience, and location.


Benefits:

  • Medical, Dental, & Vision - 100% covered premiums
  • Equity - Stock Options
  • 401(k) match
  • WFH Hardware


Bolt is committed to building a diverse and inclusive environment in which we recognize and value each other’s differences, and to fostering a culture that promotes its core values: Professionalism, Integrity, and Respect. Bolt is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, genetic information, national origin, age, disability, or status as a protected veteran.


Please note that Bolt Graphics does not currently sponsor candidates for this role. This role is strictly based in Sunnyvale, CA, and requires the candidate to be locally based, preferably in the immediate Bay Area.

Top Skills

AWS
Bash
Docker
Grafana
Kubernetes
Linux
Azure
OpenShift
Prometheus
Proxmox
Python
VMware vSphere

The Company
HQ: Sunnyvale, CA
59 Employees
Year Founded: 2020

What We Do

Bolt Graphics is a semiconductor startup building the fastest and most efficient graphics processors. Power consumption continues to rise with minimal performance improvements, resulting in increased cost and negative environmental impact.

Hardware hasn’t kept up with consumer expectations, especially in industries like architecture, engineering, film, advertising, gaming, & scientific research. Our customers came to us to solve their two main problems: performance and power consumption. At Bolt Graphics, we constantly ask ourselves, “Why is it done this way?” and “How can we do it better?” This mindset enables our team to achieve orders of magnitude faster renders and simulations and drives our goal for everyone to create, simulate, and consume immersive experiences.
