Staff AI Ops Engineer

Reposted 18 Days Ago
2 Locations
Remote
136K-266K Annually
Senior level
Big Data • Cloud • Software • Analytics
The Role
The role involves designing and maintaining infrastructure for machine learning applications, deploying ML pipelines, optimizing resources on GCP, and ensuring system observability.
Summary Generated by Built In
The Calix platform enables Communication Service Providers (CSPs) of all sizes to transform and future-proof their businesses. Through real-time data, automation, and actionable insights delivered via Calix One — our cloud-first, AI-powered platform — CSPs can simplify operations, collapse cost, and accelerate innovation. Calix One brings together the automation of everything and the experience of one, empowering customers to deliver differentiated subscriber experiences while driving acquisition, loyalty, and revenue growth. This is the Calix mission: to enable CSPs of all sizes to simplify, innovate, and grow, strengthening both their businesses and the communities they serve.
We’re at the forefront of a once-in-a-generation change in the broadband industry. Join us as we innovate, help our customers reach their potential, and connect underserved communities with unrivaled digital experiences.

Calix is where passionate innovators come together with a shared mission: to reimagine broadband experiences and empower communities like never before. As a true pioneer in broadband technology, we ignite transformation by equipping service providers of all sizes with an unrivaled platform, state-of-the-art cloud technologies, and AI-driven solutions that redefine what’s possible. Every tool and breakthrough we offer is designed to simplify operations and unlock extraordinary subscriber experiences through innovation.

Calix is seeking a highly skilled Staff AI Ops Engineer with hands-on GCP experience to join our cutting-edge AI/ML team. In this role, you will be responsible for building, scaling, and maintaining the infrastructure that powers our machine learning and generative AI applications. You will work closely with data scientists, ML engineers, and software developers to ensure our ML/AI systems are robust, efficient, and production-ready.

This is a remote-based position that can be located anywhere in the United States or Canada. Please note that the recruitment and hiring process includes an in-person meeting.

Key Responsibilities:

  • Design, implement, and maintain scalable infrastructure for ML and GenAI applications

  • Deploy, operate, and troubleshoot production ML/GenAI pipelines/services

  • Build and optimize CI/CD pipelines for ML model deployment and serving

  • Scale compute resources across CPU/GPU architectures to meet performance requirements

  • Implement container orchestration with Kubernetes

  • Architect and optimize cloud resources on GCP for ML training and inference

  • Set up and maintain runtime frameworks and job management systems (Airflow, Kubeflow, MLflow, etc.)

  • Establish monitoring, logging and alerting for systems observability

  • Optimize system performance and resource utilization for cost efficiency

  • Develop and enforce AIOps best practices across the organization

Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience)

  • 8+ years of overall software engineering experience

  • 3+ years of focused experience in DevOps/AIOps or similar ML infrastructure roles

  • Proficiency in infrastructure as code (IaC) using Terraform

  • Strong experience with containerization and orchestration using Docker and Kubernetes

  • Demonstrated expertise in cloud infrastructure management on GCP

  • Proficiency with workflow management tools such as Airflow and Kubeflow

  • Strong CI/CD expertise with experience implementing automated testing and deployment pipelines

  • Experience scaling distributed compute architectures across CPUs and GPUs

  • Solid understanding of system performance optimization techniques

  • Experience implementing comprehensive observability solutions for complex systems

  • Knowledge of monitoring and logging tools (Prometheus, Grafana, ELK stack)

  • Strong proficiency in Python

  • Familiarity with ML frameworks such as PyTorch and ML platforms like Vertex AI

  • Excellent problem-solving skills and ability to work independently

  • Strong communication skills and ability to work effectively in cross-functional teams


The base pay range for this position varies based on the geographic location. More information about the pay range specific to candidate location and other factors will be shared during the recruitment process. Individual pay is determined based on location of residence and multiple factors, including job-related knowledge, skills and experience.

San Francisco Bay Area:

156,400 - 265,700 USD Annual

All Other US Locations:

136,000 - 231,000 USD Annual

As a part of the total compensation package, this role may be eligible for a bonus.

Skills Required

  • Bachelor's degree in Computer Science or related field
  • 8+ years of software engineering experience
  • 3+ years in DevOps/AIOps or ML infrastructure roles
  • Proficient in IaC using Terraform
  • Experience with Docker and Kubernetes
  • Expertise in cloud infrastructure management on GCP
  • Proficient with Airflow & Kubeflow
  • Strong CI/CD experience
  • Experience with CPU/GPU scaling
  • Understanding of performance optimization techniques
  • Experience with observability solutions
  • Knowledge of monitoring tools (Prometheus, Grafana, ELK stack)
  • Proficiency in Python
  • Familiarity with ML frameworks (PyTorch)
  • Excellent problem-solving skills

Calix Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Calix and has not been reviewed or approved by Calix.

  • Flexible Benefits: Remote‑first policies include home‑internet reimbursement, home‑office furniture support, and work‑from‑anywhere flexibility. Feedback suggests these options make the package adaptable across locations and work styles.
  • Healthcare Strength: Coverage spans medical, dental, and vision for employees and dependents alongside EAP and virtual therapy/coaching. Wellbeing elements like lifestyle allowances, recharge days, and no‑internal‑meeting days further bolster health support.
  • Leave & Time Off Breadth: Paid vacation, wellness days, holidays, bereavement and jury‑duty leave offer broad time‑off access. Parental/bonding and caregiver leave, plus adoption assistance and medical‑travel coverage, extend support through major life events.


The Company
HQ: San Jose, CA
1,618 Employees
Year Founded: 1999

What We Do

Innovative communications service providers rely on Calix platforms to help them master and monetize the complex infrastructure between their subscribers and the cloud. Calix is the leading global provider of the cloud and software platforms, systems, and services required to deliver the unified access network and smart premises of tomorrow. Our platforms and services help our customers build next generation networks by embracing a DevOps operating model, optimize the subscriber experience by leveraging big data analytics, and turn the complexity of the smart home and business into new revenue streams.
