IN THIS ROLE YOU CAN EXPECT TO...
- Contribute to system observability by implementing and improving metrics, alerting, and dashboards for better insight and faster recovery.
- Develop automation, tooling, and monitoring solutions to support high service availability.
- Partner with application and quality engineering teams to implement best practices in reliability, release automation, and testing.
- Drive operational excellence through proactive incident prevention, blameless postmortems, and capacity planning.
- Participate in on-call rotations to support critical services and ensure rapid response to incidents.
TO THRIVE IN THIS ROLE, THESE ARE THE TALENTS YOU BRING ...
- Solid experience in Python, especially for automation, tooling, and data-driven operational tasks.
- Proficiency in at least one of Java, C++, or Go.
- Strong understanding of Linux systems, cloud infrastructure (AWS, GCP, or Azure), and modern deployment practices (Docker, Kubernetes, Terraform).
- Experience with CI/CD pipelines, version control, and automated testing frameworks.
- Experience with observability tools (e.g., Prometheus, Grafana, ELK, Datadog) and log/metric analysis for diagnosing issues.
- Proven experience facilitating and documenting Critical User Journeys and translating them into actionable SLAs/SLOs for automation.
- Demonstrated ability to collaborate with cross-functional teams and communicate clearly in high-impact situations.
- A problem-solver who approaches reliability as a shared responsibility across engineering.
- Experience writing or maintaining end-to-end or integration tests for distributed systems.
- Background in performance testing, capacity planning, or chaos engineering.
- Contributions to internal developer tooling or reliability-focused frameworks.
- Exposure to security, compliance, or change management processes in production environments.
- Relevant certifications.
HOW YOU PLAY
- Ownership over Participation: You take responsibility for achieving holistic outcomes, prioritize key objectives, and adapt quickly when situations require a different approach. You follow through even against the toughest challenges.
- Team over Stars: You are a bridge builder, establishing processes and relationships with teams outside your own. You work to rally around common goals, find win-win solutions, compromise when necessary, and help others succeed.
- Growth over Comfort: You are driven by a desire to grow and actively seek opportunities to expand your comfort zone, skills, and confidence. You embrace new challenges with curiosity, accepting discomfort and failure as opportunities to learn.
- Fairness over Popularity: You approach decisions with a scientist’s mindset, challenging your assumptions and remaining objective. You consider long-term impact rather than relying on short-term gains, proactively seek others’ perspectives, and manage emotions in decision-making.
What We Do
PlayOn is the all-in-one fan engagement platform for schools. Backed by KKR, our family of brands—including GoFan, NFHS Network, and MaxPreps—empowers schools with innovative solutions and exceptional service. We save administrators time so they can focus on what truly matters: supporting the students, staff, and fans who bring their programs to life.
Trusted by thousands of schools across the country, we're here to help create more instant replays, hold-your-breath moments, last-minute comebacks, and games you want to watch over and over again.
Why Work With Us
Product, potential, and people. We’re a leader in the high school event space, constantly evolving our product to meet the needs of administrators. We focus on solving real challenges, learning quickly, and creating impactful solutions. This is a growth-stage company, meaning your contributions have real impact.