Principal Site Reliability Engineer

Posted 21 Days Ago
Hiring Remotely in U.S.
Remote
204K-275K Annually
7+ Years Experience
Information Technology • Security
The Role
Design and implement reliable, scalable, and highly available systems and infrastructure for cloud-based applications. Develop monitoring and alerting strategies, automation tools, and disaster recovery plans. Collaborate with teams to ensure proper testing and deployment of changes. Evaluate new technologies and provide technical leadership.
Summary Generated by Built In

Who is SimSpace:

SimSpace launched in 2015 with a singular purpose: addressing the most urgent and sophisticated cybersecurity challenges to reduce risk for our most vulnerable and valuable infrastructure, the organizations around the world that we depend on every day to keep our loved ones safe and secure. These include our healthcare facilities, schools, financial institutions, transit centers, grocery stores, and workplaces, to name a few. To deliver global resiliency, we provide an elite cyber range platform to curate unassailable cyber defenses, data-driven decisions, cutting-edge training labs, live attack scenarios, and extensive individual and dynamic team readiness training.

SimSpace works as OneTeam to elevate humanity around the world. We are committed to continuously improving and delivering a cultivated member experience, whether by supporting our clients' teams or our own mission-driven SimSpacers.

We are an international company headquartered in Boston's Fort Point in the U.S. If you are interested in elevating the technology and creative solutions necessary to secure and safeguard our future while working alongside others who share your passion for purpose and development, we want to meet you!

Why should you choose a career at SimSpace?

We are an organization focused on building our culture and mindfully enhancing our atmosphere every day, which is why we have collaborated on an integral value system. Our governing philosophy of being Human Centered is deeply embedded within our value system. We apply this philosophy to every one of our internal team members, external clients, and their customers.

Our core values:

  • Serve to Protect – We provide safe space, deliver on the mission, and elevate humanity
  • Acquire Understanding – We seek and provide clarity 10x, cultivate comprehension, and believe information goes both ways
  • Operate as Innovators – We stay curious, practice consistency over intensity, and continue to be the change we need in the world
  • Teamwork Without Borders – We are never alone, we solve for all, and keep people at the heart of everything we do

We are looking for a Principal Site Reliability Engineer to design and implement reliable, scalable, and highly available systems and infrastructure for our cloud-based applications.

What will you be doing as our Principal Site Reliability Engineer:

  • Develop and implement strategies for the monitoring and alerting of systems health, performance, and security
  • Develop and implement strategies for incident management, problem management, and change management
  • Create and maintain automation tools and scripts for configuration management, deployment, and maintenance of cloud-based infrastructure
  • Conduct performance and capacity planning to ensure the systems are operating at optimal levels
  • Implement and manage the disaster recovery plan, ensuring that the systems are backed up and can be recovered in case of an outage
  • Collaborate with development and operations teams to ensure that application and infrastructure changes are properly tested, deployed, and maintained
  • Evaluate new technologies and tools, and make recommendations for their adoption based on their impact on system performance, reliability, and scalability
  • Develop and maintain documentation of system configurations, processes, and procedures
  • Provide technical leadership and mentoring to other engineers on the team

What are the qualifications to apply? To be successful as a Principal Site Reliability Engineer, you need:

  • In-depth experience in software development and/or infrastructure engineering, with a focus on site reliability and/or system administration
  • Strong experience in cloud computing, particularly with Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP)
  • Extensive experience in containerization technologies such as Docker and Kubernetes
  • Strong experience in a scripting language such as Python, Perl, or Ruby
  • Proficiency in Terraform or CloudFormation for managing infrastructure using infrastructure-as-code (IaC) principles
  • Proficiency in a configuration management tool such as Puppet, Chef, or Ansible
  • Deep understanding of networking concepts such as TCP/IP, DNS, load balancing, and firewalls

We provide the following:

  • Salary Range $204,000 - $275,000
  • Comprehensive benefits package that starts on day one
  • 401k match with immediate vesting
  • Flex time, the time off you need when you need it
  • Equity options at hire and potential for additional equity based on performance
  • Generous employee referral bonus program
  • Peloton Interactive Wellness Program
  • LinkedIn Learning Membership
  • Monthly reimbursement for meaningful connections with other SimSpacers

SimSpace is an Equal Opportunity Employer:

In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification document form upon hire. 

SimSpace does not and shall not discriminate based on race, color, religion (creed), gender, gender expression and identity, age, national origin (ancestry), disability, marital status, sexual orientation, or military/veteran status, in any of its activities or operations. We are committed to providing an inclusive and welcoming environment for all members of our staff, clients, volunteers, subcontractors, and vendors.

Research shows that women and people from underrepresented groups only apply to jobs if they meet all of the qualifications. However, no one ever meets 100% of the qualifications. SimSpace encourages you to break that statistic and to apply. We look forward to your application!

We also consider qualified applicants regardless of criminal histories, in accordance with applicable law. We are committed to providing reasonable accommodations for qualified individuals with disabilities in our job application procedures. If you need assistance or accommodation due to a disability, please contact [email protected].

SimSpace does not accept unsolicited resumes from employment agencies.

Actual compensation for the position is based on a variety of factors, including, but not limited to, affordability, skills, qualifications, and experience, and may vary from the range.

Top Skills

Python
The Company
Boston, MA
161 Employees
On-site Workplace
Year Founded: 2015

What We Do

Founded in 2015 by experts from the U.S. Cyber Command and MIT’s Lincoln Laboratory, SimSpace combines the highest-fidelity, military-grade cyber ranges and training content with unique user and adversary emulation techniques.

By providing team and individual training exercises, attack simulations, mission rehearsals, and product evaluations that leverage its cyber range, the SimSpace Cyber Force Platform delivers quantitative and actionable insights into how an organization can protect critical assets against cyber threats. SimSpace prepares individuals, teams and leaders for continued success against ever-evolving adversaries.
No other organization has SimSpace’s depth of experience in creating high fidelity cyber ranges with unique user and adversary emulation techniques.

These techniques are designed to stress people, process, and technology across individual and team-level training exercises, attack simulations, mission rehearsals, and product evaluations. SimSpace's mission is to provide an automated, cost-effective evaluation method for calculating cyber risks based on realistic, comprehensive assessments of holistic capability to yield more secure networks globally.
