The Lead Site Reliability Engineer (Lead SRE) is responsible for driving reliability, scalability, and performance across Honeywell’s production systems. This role bridges software engineering and operations, ensuring that cloud‑native platforms and AI‑enabled services are resilient, secure, and cost‑optimized. The Lead SRE will mentor engineers, establish reliability best practices, and partner with product and engineering teams to embed observability, automation, and intelligent validation into every stage of the lifecycle.
Responsibilities
- Reliability Strategy & Leadership: Define and enforce SRE standards, SLIs/SLOs, and error budgets across critical systems.
- Automation & Tooling: Build and scale automation frameworks for deployment, monitoring, and incident response.
- Cloud & Infrastructure: Lead design and optimization of hybrid cloud infrastructure (Azure, GCP) with a focus on resilience and cost efficiency.
- AI/ML Readiness: Partner with engineering teams to operationalize ML workloads, strengthen MLOps pipelines, and ensure reliability of AI‑driven services.
- Incident Management: Drive root cause analysis, postmortems, and continuous improvement for production incidents.
- Mentorship & Collaboration: Guide SRE and engineering teams, fostering a culture of ownership, learning, and proactive reliability practices.
- Governance & Security: Ensure compliance, observability, and responsible use of automation and AI in production systems.
Qualifications
- Education: Bachelor’s or Master’s in Computer Science, Engineering, or related field.
- Experience: 12+ years in software engineering or operations, with 3–5 years in SRE leadership. Proven experience managing large‑scale distributed systems and cloud infrastructure.
- Technical Skills:
  - Expertise in cloud architecture, containers, Kubernetes, and serverless patterns.
  - Strong knowledge of observability stacks (Prometheus, Grafana, ELK, OpenTelemetry).
  - Proficiency in automation and CI/CD tools (Terraform, Ansible, Jenkins, GitHub Actions).
  - Familiarity with ML pipelines and MLOps tools (Azure ML, MLflow, Databricks).
  - Programming skills in Python or Go.
- Leadership Skills: Ability to mentor engineers, influence cross‑functional partners, and drive reliability culture. Strong communicator with executive presence.
Skills Required
- Bachelor's or Master's in Computer Science, Engineering, or related field
- 12+ years in software engineering or operations
- 3-5 years in SRE leadership
- Expertise in cloud architecture and containers
- Strong knowledge of observability stacks
- Proficiency in automation and CI/CD tools
- Familiarity with ML pipelines and MLOps tools
- Programming skills in Python or Go
Honeywell Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Honeywell and has not been reviewed or approved by Honeywell.
- Retirement Support: Retirement plans feature a notably strong company 401(k) match with vesting after three years, enhancing long-term savings security. Additional tax-advantaged accounts and company contributions for eligible earners further strengthen financial preparedness.
- Leave & Time Off Breadth: Time off policies include flexible or unlimited vacation for many salaried roles and a broad observed-holiday schedule, providing manager-approved flexibility. This structure supports rest and work-life balance across varied needs.
- Parental & Family Support: Parental leave offers paid time for birth, adoption, or foster care that can be taken consecutively or intermittently. The design enables practical flexibility in how family leave is used.
Honeywell Insights
What We Do
Honeywell is a Fortune 500 company that invents and manufactures technologies to address tough challenges linked to global macrotrends such as safety, security, and energy. With approximately 110,000 employees worldwide, including more than 19,000 engineers and scientists, we have an unrelenting focus on quality, delivery, value, and technology in everything we make and do.







