Platform Operations Engineer (Site Reliability Engineer)

Posted 10 Days Ago
Westerville, OH, USA
In-Office
Senior level
Hardware • Software • Analytics
The Role
Own cross-platform observability and incident management for Vertiv Digital platforms. Design and operate monitoring, SLOs/SLIs, incident response, SLA governance, capacity planning, toil-reducing automation, and CI/CD reliability, and enforce DevSecOps and operational governance across cloud and containerized environments.
Summary Generated by Built In

Job Summary

Vertiv is seeking a skilled Platform Operations Engineer (Site Reliability Engineer) to serve as the owner of cross-platform observability, incident management, and operational reliability within Vertiv’s Digital organization. This individual contributor role is responsible for designing, implementing, and continuously improving monitoring and alerting solutions across Vertiv’s digital platform ecosystem (including Compass AI, Writer AI, Site Scope, UiPath, Workato, Cursor, and other approved enterprise tools) while owning incident response processes, SLA management, and operational governance. The Platform Operations / SRE Engineer will operate within the Digital organization and play a central role in advancing Vertiv’s Operational Excellence strategic priority by ensuring the availability, performance, and resilience of platforms that power critical digital workflows and business functions.

This is an individual contributor role with lead responsibilities. It includes proactive reliability engineering (applying SRE principles such as SLOs, error budgets, and blameless post-mortems) and embedding secure coding and operational governance practices across the Digital organization. The Platform Operations / SRE Engineer will define and enforce observability standards, lead incident response and root cause analysis, manage platform-level SLAs, and partner with engineering, security, and business stakeholders to ensure that all digital platforms meet agreed availability and performance targets.

This position partners closely with IT Security, NPDI, Digital delivery teams, and business operations, and is based on site at Vertiv’s Westerville, OH headquarters.

Responsibilities

  • Own Cross-Platform Monitoring & Observability: Design, implement, and maintain end-to-end monitoring, alerting, and observability solutions across Vertiv’s digital platform ecosystem — including AI platforms, automation tools, and internal applications — ensuring real-time visibility into system health, performance, and availability.
  • Lead Incident Response & Management: Serve as the primary escalation point and incident commander for P1/P2 incidents across Digital platforms; lead root cause analysis (RCA), blameless post-mortems, and corrective action tracking to prevent recurrence and reduce mean time to resolution (MTTR).
  • Manage Platform SLAs & Reliability Targets: Define, instrument, and enforce service level objectives (SLOs), service level indicators (SLIs), and error budgets across Digital platforms; produce regular SLA performance reports for leadership and drive platform improvements to meet or exceed agreed availability and performance targets.
  • Drive Secure Coding & Operational Governance: Champion secure coding practices and DevSecOps standards within Digital delivery teams; conduct operational readiness reviews for new platform deployments, enforce configuration management and change control processes, and partner with IT Security and NPDI to ensure all platforms meet Vertiv’s security and compliance requirements.
  • Automate Operations & Reduce Toil: Identify and eliminate manual operational toil through automation. This includes automated remediation runbooks and anomaly detection through the use of scripting, IaC tools, and approved automation platforms.
  • Capacity Planning & Performance Engineering: Analyze platform utilization trends and conduct capacity planning across Digital environments; proactively identify performance bottlenecks and recommend architectural improvements to ensure platforms scale reliably with business demand.
  • CI/CD Pipeline Reliability & Deployment Support: Partner with Digital delivery teams to ensure CI/CD pipelines are instrumented for reliability, deployment risk is managed through progressive rollout strategies, and production deployments are supported with appropriate rollback and health-check capabilities.
  • Evaluate & Advance Observability Tooling: Stay current on advancements in observability, AIOps, and SRE tooling; evaluate and recommend new tools and practices that enhance Vertiv’s platform operations maturity, and drive adoption of modern reliability engineering standards across the Digital organization.

Requirements

  • Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field; equivalent practical experience considered.
  • 5+ years of professional experience in platform operations, site reliability engineering, DevOps, or a related software/infrastructure engineering discipline.
  • 3+ years of hands-on experience with enterprise monitoring and observability platforms (e.g., Datadog, Grafana, Prometheus, Azure Monitor, Splunk, or equivalent) in a multi-platform environment.
  • Demonstrated experience owning and managing incident response processes, post-mortem facilitation, and SLA/SLO governance.
  • Experience implementing secure coding practices, DevSecOps standards, or operational governance frameworks in an enterprise software delivery environment.

Technical Skills

  • Proficiency with monitoring and observability tools (Datadog, Grafana, Prometheus, Azure Monitor, Splunk, or equivalent) for cross-platform health and performance tracking.
  • Strong knowledge of SRE principles, including SLOs, SLIs, blameless post-mortems, and toil reduction practices.
  • Hands-on experience with cloud platforms (AWS preferred) and familiarity with containerized environments (Docker, Kubernetes) and infrastructure-as-code tooling (Terraform, Ansible, or equivalent).
  • Proficiency in multiple programming languages (Python, Ruby, PowerShell, Java, JavaScript, C#, etc.) for automation and runbook development.
  • Experience with CI/CD platforms (GitLab, Jenkins, GitHub Actions, Azure DevOps, or equivalent) and deployment reliability practices including progressive rollout, feature flags, and automated health checks.

Preferred Qualifications

  • Google SRE certification, AWS DevOps Professional, Azure certifications, or equivalent SRE/cloud operations certification.
  • Experience with AIOps tooling or AI-assisted anomaly detection and automated remediation capabilities.
  • Familiarity with the Vertiv digital platform ecosystem: Workato, UiPath, Power Automate, Compass AI, Writer AI, or Cursor.
  • Experience applying DevSecOps practices, including SAST/DAST scanning, secrets management, and compliance-as-code in enterprise environments.
  • Experience working in Agile/Scrum delivery environments; familiarity with ITIL incident and change management frameworks.



The successful candidate will embrace Vertiv’s Core Principles & Behaviors to help execute our Strategic Priorities.

OUR CORE PRINCIPLES

Safety. Integrity. Respect. Teamwork. Diversity & Inclusion.

OUR STRATEGIC PRIORITIES

  • Customer Focus
  • Operational Excellence
  • High-Performance Culture
  • Innovation
  • Financial Strength

OUR BEHAVIORS

  • Own It
  • Act With Urgency
  • Foster a Customer-First Mindset
  • Think Big and Execute
  • Lead by Example
  • Drive Continuous Improvement
  • Learn and Seek Out Development


About Vertiv

Vertiv is a $10.2 billion global critical infrastructure and data center technology company. We ensure customers’ vital applications run continuously by bringing together hardware, software, analytics and ongoing services. Our portfolio includes power, cooling and IT infrastructure solutions and services that extend from the cloud to the edge of the network. Headquartered in Columbus, Ohio, USA, Vertiv employs around 20,000 people and does business in more than 130 countries. Visit Vertiv.com to learn more.

Work Authorization

No calls or agencies, please. Vertiv will only employ those who are legally authorized to work in the United States. This is not a position for which sponsorship will be provided. Individuals with temporary visas such as E, F-1, H-1, H-2, L, B, J, or TN, or who need sponsorship for work authorization now or in the future, are not eligible for hire.

Equal Opportunity Employer

Vertiv is an Equal Opportunity/Affirmative Action employer. We promote equal opportunities for all with respect to hiring, terms of employment, mobility, training, compensation, and occupational health, without discrimination as to age, race, color, religion, creed, sex, pregnancy status (including childbirth, breastfeeding, or related medical conditions), marital status, sexual orientation, gender identity / expression (including transgender status or sexual stereotypes), genetic information, citizenship status, national origin, protected veteran status, political affiliation, or disability. If you have a disability and are having difficulty accessing or using this website to apply for a position, you can request help by sending an email to [email protected].


Skills Required

  • Bachelor's degree in Computer Science, Information Systems, Engineering, or related field (or equivalent experience)
  • 5+ years professional experience in platform operations, SRE, DevOps, or related software/infrastructure engineering
  • 3+ years hands-on experience with enterprise monitoring and observability platforms (Datadog, Grafana, Prometheus, Azure Monitor, Splunk, or equivalent)
  • Experience owning and managing incident response processes, post-mortem facilitation, and SLA/SLO governance
  • Experience implementing secure coding practices, DevSecOps standards, or operational governance frameworks
  • Hands-on experience with cloud platforms (AWS preferred) and containerized environments (Docker, Kubernetes)
  • Experience with infrastructure-as-code tooling (Terraform, Ansible, or equivalent)
  • Proficiency in multiple programming/scripting languages for automation and runbook development (Python, Ruby, PowerShell, Java, JavaScript, C#)
  • Experience with CI/CD platforms and deployment reliability practices (GitLab, Jenkins, GitHub Actions, Azure DevOps, progressive rollout, feature flags)
  • Google SRE certification, AWS DevOps Professional, Azure certifications, or equivalent SRE/cloud operations certification
  • Experience with AIOps tooling or AI-assisted anomaly detection and automated remediation
  • Familiarity with Vertiv digital platform ecosystem: Workato, UiPath, Power Automate, Compass AI, Writer AI, or Cursor
  • Experience applying DevSecOps practices including SAST/DAST scanning, secrets management, and compliance-as-code
  • Experience working in Agile/Scrum environments and familiarity with ITIL incident and change management

The Company
HQ: Columbus, OH
8,435 Employees
Year Founded: 2016

What We Do

Vertiv (NYSE: VRT) brings together hardware, software, analytics and ongoing services to ensure its customers’ vital applications run continuously, perform optimally and grow with their business needs. Vertiv solves the most important challenges facing today’s data centers, communication networks and commercial and industrial facilities with a portfolio of power, cooling and IT infrastructure solutions and services that extends from the cloud to the edge of the network. Headquartered in Columbus, Ohio, USA, Vertiv employs approximately 20,000 people and does business in more than 130 countries. For more information, and for the latest news and content from Vertiv, visit Vertiv.com.
