Cloud SRE/DevOps Software Engineer (SD/DC/Remote) (R3138)

Job Posted 25 Days Ago
Hiring Remotely in United States
Remote
119K-313K Annually
Senior level
Aerospace • Artificial Intelligence • Machine Learning • Robotics • Software
Our mission is to protect service members and civilians with intelligent systems.
The Role
As a Cloud SRE/DevOps Engineer at Shield AI, you will optimize cloud deployments for Forge, manage internal Hivemind instances, and support customer deployments. Your role includes enhancing scalability, developing deployment tools, and troubleshooting issues. You will also create user-friendly documentation and collaborate with various teams to ensure smooth operations and continuous upgrades.
Summary Generated by Built In

About Shield AI

Founded in 2015, Shield AI is a venture-backed defense technology company focused on protecting service members and civilians with intelligent systems. Its flagship autonomy software, Hivemind, powers aircraft, drones, and other platforms, enabling complex missions with high reliability in contested environments. With offices in San Diego, Dallas, Washington, D.C., and internationally, Shield AI’s products actively support U.S. and allied operations worldwide. For more information, visit www.shield.ai. Follow Shield AI on LinkedIn, Twitter, and Instagram.


As a Cloud SRE/DevOps Engineer on the Forge team, you will be responsible for optimizing Forge’s cloud deployments and owning the processes that enable customers to deploy their own Forge instances. You will manage Shield AI’s internal Hivemind instances, working closely with the software operations and engineering teams to ensure Forge can scale for simulation, testing, and bursts of use. You will also enable seamless upgrades, canary deployments, and system robustness. Additionally, you’ll serve as the primary point of contact for the customer engagement team, providing expert guidance on deploying Forge in customer environments. 

What You'll Do:

  • Optimize cloud deployments of Forge to ensure scalability, reliability, and cost efficiency. 
  • Design and document processes for external customers to deploy Forge instances using the SDK in on-premises or hybrid environments. 
  • Manage and maintain internal Hivemind instances, ensuring their ability to handle large-scale simulation and testing workloads. 
  • Collaborate with the software operations team to enhance Forge’s ability to scale dynamically, accommodate bursts of use, and support continuous upgrades with minimal disruption. 
  • Develop tools and processes for canary deployments, ensuring smooth rollouts of new features and updates. 
  • Serve as the primary technical consultant for the customer engagement team, providing expertise on deploying and managing Forge in external environments. 
  • Create and maintain detailed, user-friendly documentation and tutorials for deployment processes, catering to both internal teams and external customers. 
  • Monitor, troubleshoot, and resolve issues related to Forge deployments, ensuring high availability and performance. 

Required Qualifications:

  • Typically requires a minimum of 5 - 15 years of related experience with a Bachelor’s degree.
  • 2 - 10+ years of experience in DevOps, Site Reliability Engineering, or cloud infrastructure roles. 
  • Expertise in cloud platforms such as AWS, Azure, or GCP, including deploying and managing scalable, distributed systems. 
  • Strong experience with Kubernetes and containerization. 
  • Experience creating Helm charts. 
  • Solid understanding of infrastructure-as-code tools like Terraform, CloudFormation, or similar. 
  • Proficiency in scripting and programming languages such as Python, Golang, or Bash. 
  • Demonstrated experience optimizing CI/CD pipelines, implementing canary deployments, or using tools like ArgoCD and FluxCD.
  • Familiarity with networking concepts and protocols, as well as system monitoring tools (e.g., Prometheus, Grafana). 
  • Experience deploying and configuring databases such as Postgres.
  • Excellent technical writing skills, with a proven ability to create clear, comprehensive documentation and tutorials. 
  • BS/MS in Computer Science, Engineering, or equivalent practical experience. 
  • Ability to work cross-functionally and communicate effectively with engineering, operations, and customer-facing teams. 

Preferred Qualifications:

  • Experience with secure software deployments in regulated industries such as aerospace, defense, or finance. 
  • Systems software development experience using programming languages like C++, Rust, or Golang.
  • Experience building software development kits or productized tools for deploying cloud systems. 
  • Knowledge of hybrid and on-premises deployment strategies and challenges. 
  • Hands-on experience with database performance optimization and scaling strategies. 
  • Familiarity with configuration management tools like Ansible, Chef, or Puppet. 
  • Experience building robust monitoring and alerting systems for mission-critical applications. 
  • Background in managing high-throughput simulation or testing environments. 
  • Experience optimizing databases.



Total package details for U.S. based positions:

- Regular employee positions: Salary within range listed above + Bonus + Benefits + Equity

- Temporary employee positions: Hourly within range listed above + temporary benefits package (applicable after 60 days of employment)

- Interns/Military Fellows/Part-time not eligible for bonus, benefits or equity


Total package details for international positions (roles based outside of the United States), where applicable:

- International premium, hardship differential, cost of living differential, living quarters allowance, foreign service transfer allowance, equity, international benefits, visa assistance, and relocation assistance.


Salary compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location. All offers are contingent on a cleared background and possible reference check.


Shield AI is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender identity or Veteran status. If you have a disability or special need that requires accommodation, please let us know. 

Top Skills

Bash
C++
Go
Python
Rust
The Company
HQ: San Diego, CA
750 Employees
Hybrid Workplace
Year Founded: 2015

What We Do

Shield AI is building the world's best AI pilot. Our Hivemind autonomy stack is the first and only autonomous AI pilot, deployed in combat since 2018. Hivemind enables intelligent teams of aircraft to perform missions ranging from room clearance to penetrating air defense systems and dogfighting F-16s.

Why Work With Us

What makes Shield AI special is our people. We unlock the power of autonomy, and in the face of overwhelming odds and challenges, we find ways to win and make a difference for our customers. We bring together software, AI, and aerospace engineering disciplines to deploy the most intelligent aviation capabilities in the world.
