SRE Manager

Dallas, TX
In-Office
150K-206K Annually
Senior level
Digital Media • Software • Analytics
The Role
As the SRE Manager, you will lead a global team ensuring system performance, reliability, and scalability while driving improvements in DevOps practices and cross-functional collaboration.

Transforming the Future of Enterprise Planning

At o9, our mission is to be the Most Value-Creating Platform for enterprises by transforming decision-making through our AI-first approach. By integrating siloed planning capabilities and capturing millions—even billions—in value leakage, we help businesses plan smarter and faster.

This not only enhances operational efficiency but also reduces waste, leading to better outcomes for both businesses and the planet. Global leaders like Google, PepsiCo, Walmart, T-Mobile, AB InBev, and Starbucks trust o9 to optimize their supply chains.

Senior Site Reliability Engineering Manager

At o9, we invest in people. We seek talented, driven individuals to power our transformative approach. You’ll thrive in a dynamic, supportive environment, growing while making a real impact.

As the Site Reliability Engineering Manager at o9 Solutions, you will lead a high-performing global team responsible for ensuring the availability, performance, and scalability of our SaaS-based supply chain and retail planning platform. This leadership role requires a strategic thinker with strong technical acumen and a passion for building resilient systems. You will partner closely with cross-functional teams to drive improvements in reliability, system architecture, incident management, and DevOps best practices.

What you’ll do for us

Team Leadership & Development

  • Hire, mentor, and manage a globally distributed team of Site Reliability Engineers.
  • Foster a culture of continuous improvement, accountability, collaboration, and operational excellence.
  • Set performance goals, conduct regular feedback sessions, and support career growth for team members.

Reliability & Performance Management

  • Own system uptime and SLA compliance across o9’s cloud-native production environment.
  • Drive root cause analysis and implement post-incident learning processes to improve system resilience.
  • Oversee the design and implementation of robust monitoring, alerting, and logging solutions.

Operational Strategy & Automation

  • Lead initiatives to improve infrastructure automation, deployment pipelines, and CI/CD practices.
  • Champion Infrastructure as Code (IaC) and GitOps best practices.
  • Manage capacity planning, scalability efforts, and performance tuning across services.

Cross-functional Collaboration

  • Work closely with Engineering, QA, Product, and Customer Support teams to embed reliability into every stage of the software lifecycle.
  • Advocate for SRE principles in system design, ensuring high availability and fault tolerance.
  • Collaborate with cloud service providers (AWS, Azure, GCP) to optimize performance and cost.

Incident Response & Support

  • Oversee 24/7 on-call rotations and ensure timely response to production incidents.
  • Implement and refine incident management processes and playbooks.
  • Communicate effectively with stakeholders during and after major incidents.

What you’ll have

Education & Experience

  • Bachelor’s degree in Computer Science, Engineering, or a related field required; Master’s degree preferred.
  • 8+ years of experience in DevOps, SRE, or infrastructure roles, with 2+ years leading or managing technical teams.
  • Experience operating complex, cloud-native production systems at scale.

Certifications

  • Relevant cloud certifications (AWS, Azure, or GCP) strongly preferred.
  • Certified Kubernetes Administrator (CKA) certification is a plus.

Technical Proficiency

  • Strong knowledge of cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes).
  • Expertise in observability tools (Prometheus, Grafana, Datadog, etc.) and incident management platforms.
  • Experience with Infrastructure as Code and configuration management tools (Terraform, Ansible, Helm, etc.).
  • Solid understanding of networking, security, Linux internals, and distributed systems.

Soft Skills

  • Proven ability to lead technical teams through high-stakes, high-impact situations.
  • Strong communication skills with the ability to translate complex topics into clear stakeholder updates.
  • Strategic mindset with a bias for action and problem-solving.

This position at o9 Solutions has an annual salary range of $149,818-$205,999. Additionally, you may be eligible to participate in our medical, retirement, and other company-sponsored benefits. The above information reflects the expected base salary range, although the lower and upper bounds may vary based on location, skills, experience, certifications, licenses, or other relevant factors.

More about us… 

At o9, transparency and open communication are at the core of our culture. Collaboration thrives across all levels; hierarchy, distance, and function never limit innovation or teamwork. Beyond work, we encourage volunteering opportunities, social impact initiatives, and diverse cultural celebrations.

With a $3.7 billion valuation and a global presence across Dallas, Amsterdam, Barcelona, Madrid, London, Paris, Tokyo, Seoul, and Munich, o9 is among the fastest-growing technology companies in the world. Through our aim10x vision, we are committed to AI-powered management, driving 10x improvements in enterprise decision-making. Our Enterprise Knowledge Graph enables businesses to anticipate risks, adapt to market shifts, and gain real-time visibility. By automating millions of decisions and reducing manual interventions by up to 90%, we empower enterprises to drive profitable growth, reduce inefficiencies, and create lasting value.

o9 is an equal-opportunity employer that values diversity and inclusion. We welcome applicants from all backgrounds, ensuring a fair and unbiased hiring process. Join us as we continue our growth journey!

Top Skills

Ansible
AWS
Azure
Datadog
GCP
Grafana
Helm
Kubernetes
Prometheus
Terraform

The Company
Dallas, TX
920 Employees

What We Do

o9 Solutions is a cloud-based business management platform powering digital transformations of integrated planning and operations.
