Site Reliability Engineer

Posted 12 Days Ago
Hiring Remotely in Philippines
Remote
Senior level
News + Entertainment • Software
The Role
The Site Reliability Engineer will ensure the reliability of cloud infrastructure, manage deployments with Kubernetes, and automate processes with tools like Terraform and GitHub Actions while monitoring service health.
It's fun to work in a company where people truly BELIEVE in what they're doing!

We're committed to bringing passion and customer focus to the business.

ABOUT AVID

Avid makes technology and collaborative tools so creators can entertain, inform, educate and enlighten the world. Our customers are the visionaries behind the most inspiring feature films, television programs, news broadcasts, televised sporting events, music recording and live concerts. To learn how Avid powers greater creators or for more information, visit www.avid.com.

 

JOB SUMMARY:

Come and join us at Avid as a Site Reliability Engineer (Remote, Philippines), where you will play a key role in ensuring the reliability, performance, and scalability of our cloud infrastructure and production systems. You’ll work closely with cross-functional engineering teams to design resilient architectures, automate deployments, and deliver a highly available platform.

WHAT YOU WILL DO:
  • Champion and continuously improve platform reliability, observability, and DevOps culture across the engineering organization.

  • Define and track SLAs, SLOs, and SLIs to drive reliability goals and monitor service health across the platform.

  • Design, implement, and tune application and component monitoring, alerting, and dashboards using Prometheus, Grafana, CloudWatch, and Elastic (or similar tools).

  • As part of your site reliability work, you will also have the opportunity and responsibility to improve and harden core systems together with the larger cloud engineering team. This work may include:

  • Design, operate, and optimize Kubernetes workloads on Amazon EKS, managing containerized applications across multiple environments.

  • Implement and maintain Istio service mesh for secure, resilient, and observable service-to-service communication within Kubernetes.

  • Build and manage GitOps pipelines using ArgoCD, ensuring Kubernetes manifests and Helm charts are deployed and audited correctly.

  • Automate CI/CD workflows with GitHub Actions, enabling fast and safe software delivery.

  • Automate infrastructure provisioning with Terraform, enabling consistent, repeatable, and auditable AWS deployments.

  • Participate in a 24/7 on-call rotation, handle incident response, perform postmortems, and maintain up-to-date runbooks.

  • Secure applications and infrastructure with tools like Snyk and follow security best practices.

  • Manage edge and DNS configurations with Cloudflare and Amazon Route 53, ensuring performance and global availability.

  • Operate and tune AWS services such as RDS, OpenSearch, and IAM, supporting data and identity needs.

WHAT YOU CAN DELIVER:

Minimum Requirements:

  • Bachelor’s degree in Information Technology, Computer Science, Software Engineering, or a related field.

  • 5+ years of experience in Site Reliability Engineering, DevOps, or an equivalent role.

  • Strong proficiency with Kubernetes (preferably Amazon EKS) and containerized application deployments.

  • Proficiency with observability stacks (Prometheus, Grafana, ELK) and alerting best practices.

  • Strong scripting skills (Bash, Python, or similar) for automation and tooling.

Preferred Skills, Experience, Capabilities:

  • Expertise with infrastructure-as-code tools such as Terraform, and GitOps workflows using ArgoCD.

  • Hands-on experience with AWS services including RDS, IAM, OpenSearch, CloudWatch, and Route 53.

  • Experience with CI/CD automation using GitHub Actions or similar tools.

  • Knowledge of service mesh technologies such as Istio.

Aside from the minimum requirements and preferred qualifications above, the successful candidate shall possess the following behavioral traits and technical skills:

  • Understanding of security best practices in cloud environments, including vulnerability scanning and remediation.

  • Excellent troubleshooting skills, especially in distributed, cloud-native systems.

  • Strong communication skills and ability to work cross-functionally in a collaborative environment.

WHAT TO LOOK FORWARD TO:
  • Join a global team and experience a dynamic, collaborative work environment that fosters innovation and growth.

  • Remote work model offering flexibility to balance work and life.

  • Access to development programs with strong support and mentoring to help you grow and advance within the company.

  • Attractive benefits package including health and life insurance, referral rewards, and generous leave policies to ensure a healthy work-life balance.

OUTSIDE OF SCOPE:
  • Direct software feature development unrelated to infrastructure or reliability.

  • Desktop IT support, end-user troubleshooting, or helpdesk functions.

  • On-premises server hardware maintenance.

  • Manual infrastructure deployments without automation or version control.

  • Marketing, sales, or customer account management duties.

Avid is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.


If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

Top Skills

Amazon EKS
ArgoCD
AWS IAM
AWS OpenSearch
AWS RDS
Cloudflare
CloudWatch
Elastic
GitHub Actions
Grafana
Kubernetes
Prometheus
Route 53
Snyk
Terraform

The Company
HQ: Burlington, MA
1,522 Employees
Year Founded: 1987

What We Do

We help media visionaries create art that colors our perceptions and enriches our culture. We make innovative technology and collaborative tools that inspire and spark joy so creators can entertain, inform, educate and enlighten the world.

We believe in our artists. We believe in our industry leaders. And we believe in the future of entertainment. We have a rich, 30-year history of powering media and entertainment. But we know our history doesn’t determine our future, so we are always evolving, committed to making good better and better best.

We make many products, but we only do one thing: maximize the mediums of amazing makers. At Avid, every minute, of every day, we are powering greater creators.
