Sr. DevOps Engineer

Posted 9 Days Ago
Gurugram, Haryana
In-Office
Senior level
Artificial Intelligence • Software • Energy
The Role
The Sr. DevOps Engineer will design, deploy, automate, and manage AWS cloud systems, focusing on scalability, observability, and cost optimization while ensuring high availability and performance.
Summary Generated by Built In

About Stem

Driven by human and artificial intelligence, Stem is unlocking energy intelligence.

Stem is a global leader reimagining technology to support the energy transition, turning complexity into clarity and potential into performance.
We help asset owners, operators, and stakeholders benefit from the full value of their energy portfolio by enabling the intelligent development, deployment, and operation of clean energy assets. Our integrated software suite, PowerTrack, is the industry standard and best-in-class for asset monitoring, supported by professional and managed services under one roof. Designed to tackle challenges as seamlessly as possible, Stem presents the needed information clearly and accurately and helps turn raw data into actionable insight. With global projects managed in 55 countries – from Germany to Japan and across North America – customers have relied on Stem for nearly 20 years to maximize the value of their clean energy projects.

Stem’s culture embodies diversity & inclusion beyond the traditional facets of gender, ethnicity, age, disabilities, and sexual orientation to include experience, personality, communication, work styles, and more. At our core, Stem is at the momentous intersection of clean energy and software technology, where diverse ideas, experiences, and professional skills converge to make the inclusive culture we have today. Together, we are turning old-school thinking about software and energy into progressive, collaborative, and innovative solutions. By joining our team, you will collaborate with data scientists, energy experts, skilled salespeople, thought-leading executives, and more from a range of backgrounds. This intersection of ideas, beliefs, and skills is what makes us unique enough to lead the world’s largest network of digitally connected energy storage systems.

We are seeking DevOps Engineers who want to work on designing, building, and operating cloud-native, enterprise-level platforms connected to a large fleet of IoT devices. Our technology stack includes:

  • Languages/Frameworks: Python, Java, C#/.NET
  • Databases: DynamoDB, MySQL, MS-SQL, PostgreSQL, MongoDB, InfluxDB, TimescaleDB
  • Cloud Platform: AWS
  • Observability: Datadog, Grafana, Prometheus, OpenSearch, CloudWatch

Responsibilities

  • Design, deploy, automate, and manage AWS cloud-based production systems with IoT-connected devices, ensuring availability, performance, scalability, and security
  • Build and maintain comprehensive observability solutions including metrics, logs, and distributed tracing to provide full-stack visibility across applications and infrastructure
  • Design and implement alerting strategies that minimize noise, reduce alert fatigue, and enable rapid detection of production issues
  • Develop runbooks, automated remediation workflows, and self-healing infrastructure to reduce mean time to recovery (MTTR)
  • Analyze cloud spend and implement cost optimization strategies including right-sizing, Reserved Instances, Savings Plans, and resource lifecycle management
  • Build dashboards and reporting tools to provide visibility into infrastructure costs and enable teams to make data-driven decisions
  • Build and maintain self-service platforms through automation to increase developer productivity and assure product/service quality
  • Troubleshoot and solve problems across AWS infrastructure and application domains; lead incident response and conduct blameless post-mortems
  • Design durable and consistent patterns for distributed systems; recommend architecture and process improvements
  • Collaborate across multiple functional and technical teams to deliver projects on time and build enterprise-level platforms per the roadmap
  • Analyze and resolve complex infrastructure and application deployment issues
  • Evaluate emerging technology trends to enable evolving business and operating models
  • Facilitate the evaluation and selection of software products, services, and standards; design standard and custom software configurations
  • Assess existing platforms to identify deficiencies and improvements; recommend whether to maintain, refresh, or retire products, services, or systems
  • Ensure critical system security using industry-leading cloud security solutions

Requirements

  • 5+ years of overall experience, with 3+ years in enterprise environments
  • 3+ years building and managing cloud and IoT platforms supporting large, highly available, enterprise-grade applications
  • 4+ years working with AWS technologies (e.g., EC2, EKS, ECS, S3, Redshift, VPC, Glacier, IAM, CloudWatch, SQS, Lambda, CloudTrail, Systems Manager, KMS, Kinesis) with emphasis on the AWS Well-Architected Framework
  • Strong experience implementing observability solutions including metrics collection, centralized logging, and distributed tracing (e.g., OpenTelemetry, Jaeger, X-Ray)
  • Proven ability to design effective alerting systems with appropriate thresholds, escalation policies, and on-call rotations
  • Experience with incident management, root cause analysis, and building automated remediation workflows
  • Demonstrated track record of identifying and implementing AWS cost optimization strategies (right-sizing, Reserved Instances, Savings Plans, spot instances, resource scheduling)
  • Familiarity with AWS cost management tools (Cost Explorer, Budgets, Cost Allocation Tags, Compute Optimizer)
  • Strong Infrastructure-as-Code skills using tools such as Terraform, Ansible, Python, and Shell scripting
  • Hands-on experience with containerization and orchestration (e.g., Docker, Kubernetes, AWS EKS, ECS)
  • Solid experience in 24x7 production AWS environments, including CI/CD pipelines (Jenkins, GitLab CI, etc.)
  • Strong understanding of Site Reliability Engineering principles, SLOs/SLIs/SLAs, error budgets, and chaos engineering
  • Linux and Windows server administration
  • Experience with observability and monitoring platforms (e.g., Datadog, Grafana, Prometheus, OpenSearch/Elastic Stack, CloudWatch, PagerDuty)
  • Understanding of network topologies and protocols (DNS, HTTP/HTTPS, SSH, SFTP, SMTP)
  • Experience with IT compliance and risk management frameworks (e.g., NIST, SOC 2, SOX, FedRAMP)
  • Experience collaborating with client IT organizations to define appropriate solutions

Preferred Qualifications

  • AWS Solutions Architect Professional certification
  • CKA: Certified Kubernetes Administrator certification


Stem, Inc. is an equal opportunity employer committed to diversity in the workplace and does not discriminate against any employee or applicant for employment because of race, color, sex, pregnancy, religion, national origin, ethnicity, citizenship, sexual orientation, gender identity, age, marital status, disability, genetic information, military status, protected veteran status or any other factor protected by applicable federal, state or local laws.  

Top Skills

Ansible
AWS
C#/.NET
CloudWatch
Datadog
Docker
DynamoDB
GitLab CI
Grafana
InfluxDB
Java
Jenkins
Kubernetes
MongoDB
MS-SQL
MySQL
OpenSearch
PostgreSQL
Prometheus
Python
Terraform
TimescaleDB

The Company
HQ: San Francisco, CA
504 Employees
Year Founded: 2009

What We Do

Stem, Inc. is a global leader in AI-enabled software and services that enable customers to plan, deploy, and operate clean energy assets. We offer a complete set of solutions that transform how solar and energy storage projects are developed, built, and operated, including an integrated suite of software and edge products, and full lifecycle services from a team of leading energy experts. More than 16,000 global customers rely on Stem to maximize the value of their clean energy projects and portfolios.
