Site Reliability Engineer (Remote)

Posted 5 Days Ago
Hiring Remotely in United States
Remote
Senior level
Aerospace • Software
The Role
The Site Reliability Engineer will build and support complex infrastructure and deployment scenarios, contributing to the software lifecycle from design to operation. Responsibilities include improving system reliability, automation, incident response, and supporting QA and security audits, while collaborating with development and customer teams.
Summary Generated by Built In

Epsilon3 is a multi-product operations management platform revolutionizing the way teams build, launch, and operate spacecraft and other advanced hardware systems.


Launched in 2021, our company is led by engineers from SpaceX, Google, and NASA, who have experience supporting over 100 space missions. Innovative teams at Blue Origin, Rocket Lab, Axiom Space, Firefly Aerospace, and many others depend on our web-based (SaaS) solutions to plan and track high-stakes procedures. We raised a $15M Series A funding round led by Lux Capital, Y Combinator (YC S21), and other world-class investors.


This role is remote and can be based anywhere in the United States.


We are looking for a Site Reliability Engineer (SRE) who is interested in space exploration and passionate about building scalable, reliable, and secure software. You will be responsible for building and supporting complex infrastructure and deployment scenarios. We are currently using technologies such as React.JS, Node, Postgres, AWS GovCloud, Docker, and K8s, and our stack will evolve over time as we scale our solutions and approach.


The ideal candidate has years of experience using Kubernetes (K8s) and is proficient in JavaScript.

Some of the technical challenges we’re undertaking:

  • Real-time synchronization of data and user interfaces across Earth and space
  • Visualization of many complex data fields
  • Integration of multiple high-bandwidth data streams for real-time processing and display
  • Multiple deployment environments including cloud and on-premises
  • Mission-critical security and reliability requirements
  • Supporting complex workflows and detailed tracking while keeping the user experience simple and delightful

Responsibilities:

  • Support and contribute to the entire lifecycle of our software, from inception and design through deployment, operation, and refinement
  • Support our services in production and before they go live through system design, security considerations, capacity planning, and launch preparedness
  • Build processes and systems to continuously improve system reliability and performance
  • Build processes and systems to continuously improve the productivity of the rest of the development team
  • Scale systems sustainably through automation and continuous improvement in reliability and velocity
  • Practice sustainable incident response and postmortems
  • Contribute to the design, build, test, and release of our web-based operational dashboards, electronic procedure tools, and suite of specialized software solutions to support various missions
  • Join and actively participate in customer discovery calls and technical demonstrations
  • Analyze and enhance the security, efficiency, stability, and scalability of our software systems
  • Support software QA and user testing
  • Support and facilitate security reviews and audits of our systems by customers and third parties
  • Facilitate compliance with cybersecurity certifications and contribute to improvements in our security policies and processes
  • Assess third-party and open source software and develop integrations
  • Contribute to the growth and refinement of our engineering culture, processes, and tools

Qualifications:

  • Bachelor’s Degree in Computer Science or related field
  • 5+ years of combined experience in site reliability and production software engineering
  • Proficiency with Kubernetes (K8s) and JavaScript (JS) is required for this role
  • Strong foundation in computer science concepts (algorithms, data structures, object-oriented programming, design, testing, etc.)
  • Self-starter and able to navigate ambiguity and assess rapidly evolving priorities
  • Strong team player with great communication skills and collaborative work ethic
  • Love of learning (technical and otherwise)
  • Experience in fast-growing tech startups is a plus
  • Experience with Lean Startup methodologies (agile software development) is a plus
  • US Citizenship (future security clearance may be required)
  • Must be located in the United States

Salary range: $120,000 - $175,000


This full-time role includes stock options, generous PTO, health insurance, and a 4% 401(k) match.


We meet in person four times per year for hackathons and fun team bonding activities.


Epsilon3 is an equal opportunity employer committed to diversity and inclusion in the workplace. We prohibit discrimination and harassment of any kind based on race, color, sex, religion, sexual orientation, national origin, disability, genetic information, pregnancy, or any other protected characteristic as outlined by federal, state, or local laws. This policy applies to all employment practices within our organization, including hiring, recruiting, promotion, termination, layoff, recall, leave of absence, compensation, benefits, training, and apprenticeship. Epsilon3 makes hiring decisions based solely on qualifications, merit, and business needs at the time.



Top Skills

JavaScript
The Company
HQ: Los Angeles, CA
26 Employees
On-site Workplace
Year Founded: 2021

What We Do

Epsilon3 is the OS for spacecraft and complex operations.

Epsilon3’s software platform manages complex operational procedures, saving operators time and reducing errors.

If you’re running complex, high-stakes operations, Epsilon3 is for you.

Epsilon3 is funded by Y Combinator and other world-class investors.
