Department: Engineering / Platform Reliability
Location: Austin (candidates must be based in Austin or willing to relocate for this role)
About the Role:
We are looking for a Senior Site Reliability Engineer who is passionate about building and maintaining highly available, scalable, and resilient systems. In this role you will serve as a senior engineer on the SRE team, driving reliability improvements across our production infrastructure while mentoring engineers and shaping our incident response culture.
You will partner closely with software engineering, security, and product teams to embed reliability into every stage of the development lifecycle. This is a high-impact position for someone who thrives at the intersection of software engineering and operations.
Key Responsibilities
- Design, build, and maintain production infrastructure across cloud platforms (AWS, GCP, or Azure) to meet availability targets of 99.99% or higher
- Define and champion SLOs, SLIs, and error budgets; drive data-informed reliability decisions across engineering teams
- Lead incident response efforts as Incident Commander; conduct blameless post-mortems and drive remediation to completion
- Develop and maintain infrastructure-as-code (Terraform or CloudFormation) and CI/CD pipelines for automated, repeatable deployments
- Build and improve observability platforms using tools such as Prometheus, Grafana, New Relic, Splunk, or the ELK stack
- Reduce toil through automation, including custom tooling, self-healing systems, and proactive capacity planning
- Architect and operate container orchestration systems (Kubernetes, ECS) at scale with emphasis on cost efficiency and performance
- Collaborate with security teams to embed security best practices into infrastructure and deployment pipelines
- Mentor junior and mid-level SREs through code reviews, knowledge-sharing sessions, and pair-programming
- Contribute to the on-call rotation and continuously improve runbooks, alerting, and escalation procedures
Required Qualifications
- 7+ years of experience in SRE, DevOps, or platform engineering roles with progressive responsibility
- Strong proficiency in at least one programming language (Python, Go, Java, or similar) for systems-level automation and tooling
- Deep hands-on experience with at least one major cloud provider (AWS, GCP, or Azure) including networking, IAM, and managed services
- Expert-level knowledge of container orchestration (Kubernetes) and microservices architectures
- Demonstrated experience defining SLOs/SLIs and managing error budgets in production environments
- Solid understanding of distributed systems concepts: consensus algorithms, CAP theorem, eventual consistency, and fault-tolerant design
- Proficiency with infrastructure-as-code tools (Terraform, CloudFormation) and configuration management
- Experience with CI/CD platforms (Jenkins, GitHub Actions) and GitOps workflows
- Strong Linux systems administration skills and networking fundamentals (TCP/IP, DNS, load balancing, CDN)
- Proven track record of leading incident response, writing effective post-mortems, and implementing systemic fixes
Preferred Qualifications
- Familiarity with chaos engineering practices and tools
- Background in database reliability engineering (Oracle, PostgreSQL, MySQL, Redis, or Cassandra at scale)
- Hands-on experience with FinOps practices and cloud cost optimization
- Contributions to open-source SRE or infrastructure projects
- Relevant certifications (CKA, AWS Solutions Architect Professional, GCP Professional Cloud Architect)
Technical Environment
Our stack includes Kubernetes, Terraform, and AWS; New Relic, Prometheus, and Grafana for monitoring; PagerDuty for on-call; GitHub Actions for CI/CD; and a mix of Java, Go, and Python microservices. We are a team that values automation over manual intervention and continuously invests in reducing toil.
What We Offer
- Competitive base salary with annual performance bonuses
- Comprehensive health, dental, and vision insurance with generous employer contribution
- Flexible hybrid/remote work model
- Annual learning and development budget for conferences, certifications, and courses
- Generous PTO policy, paid parental leave, and wellness programs
- 401(k) with employer match
- Collaborative, blameless engineering culture that values continuous improvement
Equal Opportunity Statement
We are an equal opportunity employer committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other legally protected characteristic.
Compensation:
Compensation includes a base salary in the range of $111,600.00 - $186,000.00 per year. The base salary may vary within the anticipated base pay range based on factors such as the ultimate location of the position and the selected candidate's knowledge, skills, and abilities. The position may be eligible for additional compensation that may include an incentive program.
Benefits:
The Company offers eligible employees the flexibility to take as much vacation with pay as they deem consistent with their duties, the company's needs, and its obligations; seven paid holidays throughout the calendar year; and up to 160 hours of paid wellness annually for their own wellness or that of family members. Employees are also eligible for additional paid time off in the form of bereavement leave, time off to vote, jury duty leave, volunteer time off, military leave, and parental leave.
EOE, including disability/vets
Skills Required
- Candidates must be based in Austin or willing to relocate for this role
- Familiarity with Prometheus, Grafana, New Relic, Splunk, or the ELK stack for observability
- Experience operating container orchestration at scale (Kubernetes, ECS) and managing cost/performance
Cox Enterprises Compensation & Benefits Highlights
- Retirement Support: The 401(k) includes a dollar-for-dollar match up to 6% of pay plus an additional fixed 2% company contribution with immediate vesting and auto-enrollment via Vanguard. Legacy cohorts may have different retirement arrangements, but the enhanced match is emphasized as a current standard.
- Healthcare Strength: Multiple medical options (Core PPO, Premium PPO, HDHP + HSA) and Kaiser in CA are available, with in-network preventive care covered at 100% and openly published 2026 plan details and premiums. The program lineup extends to pharmacy, dental, vision, telehealth, and condition-specific supports.
- Parental & Family Support: Eight weeks of paid parental leave, fertility coverage via Progyny, adoption assistance, and childcare/backup care resources complement flexible PTO and paid time off for voting, volunteering, and jury duty. These benefits are positioned to support employees across family life stages.
Cox Enterprises Insights
What We Do
For well over a century, Cox Enterprises has been shaping the future with daring ideas and values-driven thinking. Since our founding in 1898, our relentless spirit of innovation has driven us to disrupt industries and enhance the quality of life in the communities we serve. Through our major divisions — Cox Communications, Cox Automotive and Cox Farms — our people have countless opportunities to grow and make an impact in the communications and automotive industries, as well as in new ventures in agriculture, cleantech, digital media and more. As a privately-held, family-owned business, we know that people are our most valuable asset. We offer a supportive and inclusive environment with flexible career growth, amazing benefits and work-life balance at the forefront. Our mission, our ways of working and our commitment to people are what make our workplace culture remarkably flexible and resilient. Join us to build a better future and make your mark.
Why Work With Us
At our core, Cox is a technology company that values human relationships. We know people feel most empowered when their work has meaning, when they feel respected and have opportunities to grow. “Career satisfaction” is not enough at Cox — we’re here to help you find balance, live well and achieve your career goals even as they change over time.
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
Every person has different working styles and preferences — and we aim to empower teams to work where they are most comfortable. Some roles require in-person work, but for those that can be performed remotely, we offer flexibility.