Site Reliability Engineer

Posted 17 Days Ago
Hiring Remotely in Houston, TX
Remote
Senior level
Artificial Intelligence • Machine Learning • Energy
The Role
The Site Reliability Engineer will design and support Imubit's cloud infrastructure to ensure high uptime and security. Responsibilities include optimizing deployment processes, incident management, automating systems, and collaborating with software developers to enhance infrastructure resilience and scalability.

TL;DR:

Imubit is looking for a Site Reliability Engineer to help disrupt the refining and chemical industries with breakthrough machine learning technologies.


About us:

Imubit directly controls and optimizes refineries and chemical plants with AI to add millions of dollars to the plant bottom line while managing safe operating limits, energy efficiency, and sustainability objectives. Imubit’s Closed Loop Neural Network platform allows customers to leverage an advanced form of AI called Reinforcement Learning (RL). Through our patented approach to applying RL to industrial processes, industry leaders have been able to fundamentally change the way they optimize their plants and improve profitability in real time. Imubit’s solution is currently optimizing the manufacturing facilities of Fortune 500 companies. Imubit has combined the industry expertise from companies like Exxon and Shell with award-winning data scientists endorsed by Google. Imubit is backed by tier-1 venture capital firms such as Insight Partners.


We are looking for:

You, a top-notch Site Reliability Engineer, who will design and support Imubit’s cloud infrastructure. As part of this, you will work to optimize deployment processes and keep systems running. You will work with a variety of cloud technologies, automation, and infrastructure-as-code. Additionally, our SREs keep an ever-watchful eye on our systems’ capacity and performance. Much of our time is spent optimizing existing systems, building infrastructure and reducing repetitive work through automation.

You will also play a critical role in incident management, swiftly identifying and resolving issues to minimize downtime and ensure seamless operations. Collaboration is key in this role, as you will work closely with software developers, DevOps engineers, and other stakeholders to implement robust solutions and drive continuous improvement. As a proactive member of our team, you will stay updated with the latest industry trends and best practices, applying this knowledge to enhance our infrastructure's resilience and scalability. Your contributions will directly impact the reliability and efficiency of our services, making you an integral part of our success.


In this position, you will:

  • Design, deploy and maintain Imubit’s cloud infrastructure to provide high uptime, scalability and security.
  • Leverage public cloud services and tools to improve efficiency and reliability of our services and workflows.
  • Architect and manage cross-cloud network infrastructure (e.g. subnets, routing tables, IPsec VPNs, Transit Gateways, firewall rules).
  • Engage in and improve the whole lifecycle of services, from inception and design, through deployment, operation and refinement.
  • Participate in the infrastructure on-call rotation and respond in a timely manner.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.

Minimum Qualifications: 

  • 5 years of experience maintaining production-level cloud infrastructure, including public cloud services (e.g. AWS, GCP).
  • Preferred: BA/B.Sc. in Computer Science or equivalent.
  • Experience with a programming language such as Python or Go.
  • Experience deploying and supporting services in Kubernetes, including GitOps management tools such as ArgoCD.
  • Familiarity with software development principles/concepts (e.g. version control (Git), software development lifecycle).
  • Experience implementing and utilizing monitoring tools (e.g. New Relic, Splunk, Grafana, Prometheus).
  • Experience managing production databases (e.g. PostgreSQL), including managed services (e.g. AWS RDS).
  • Experience with infrastructure-as-code concepts and tools (e.g. Terraform, Ansible).
  • Experience with secrets management tools (e.g. HashiCorp Vault, AWS Secrets Manager).
  • Interest in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Ability to debug and optimize code and automate routine tasks.
  • Systematic problem-solving approach, coupled with effective communication skills and a sense of ownership and drive.

Imubit provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics. In addition to federal law requirements, Imubit complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.


Imubit does not accept, retain, or respond to unsolicited CVs or phone calls, including those from third parties representing job seekers.


No visa sponsorship is available for this position.



Top Skills

Go
Python
The Company
HQ: Houston, TX
137 Employees
On-site Workplace
Year Founded: 2016

What We Do

At Imubit, we’re driven by a mission to tackle and solve the toughest challenges at chemical plants and refineries.

Our solution - the Imubit Closed Loop Neural Network™ - is an AI process optimization technology that enables plant managers to discover, engineer and monetize process optimization opportunities considered impossible until now.

As the first-ever technology of its kind in the industry, our neural network solution interconnects planning and economics, process engineering, process control and operations, using historical data to learn subtle nonlinear dynamics and optimize economically critical plant parameters in real time. This end-to-end solution is currently running in the world’s largest refining and petrochemical plants, unlocking millions of dollars in annual margin for operators.

Imubit is led by globally-renowned, Google-endorsed machine learning scientists and world-class hydrocarbon processing experts with over 220 years of industry experience.
