Staff Software Engineer - Cloud Platform (Kafka)

Hiring Remotely in USA
Remote
136K-266K Annually
Senior level
Big Data • Cloud • Software • Analytics
The Role
Design and implement cloud infrastructure and data pipelines using GCP services, optimize performance, and collaborate with teams for seamless integration and automation.
The Calix platform enables Communication Service Providers (CSPs) of all sizes to transform and future-proof their businesses. Through real-time data, automation, and actionable insights delivered via Calix One — our cloud-first, AI-powered platform — CSPs can simplify operations, collapse cost, and accelerate innovation. Calix One brings together the automation of everything and the experience of one, empowering customers to deliver differentiated subscriber experiences while driving acquisition, loyalty, and revenue growth. This is the Calix mission: to enable CSPs of all sizes to simplify, innovate, and grow, strengthening both their businesses and the communities they serve.
We’re at the forefront of a once-in-a-generation change in the broadband industry. Join us as we innovate, help our customers reach their potential, and connect underserved communities with unrivaled digital experiences.

This is a remote-based position in the US. Please note that the recruitment and hiring process includes an in-person meeting.

We are seeking a skilled and experienced Staff Cloud Platform Engineer with expertise in Kafka to join our Cloud Platform team. The Staff Cloud Platform Engineer will design, deploy, operate, and optimize our Apache Kafka-based event streaming infrastructure at scale on Google Cloud Platform (GCP). The ideal candidate will have a strong background in DevOps practices, cloud infrastructure automation, and big data technologies. In this role you will partner closely with platform, data, and application engineering teams to ensure our Kafka clusters, running natively on GCP or AWS, are reliable, performant, and secure.

Responsibilities:

  • Design, provision, and manage Apache Kafka clusters (self-managed on GCP/AWS or via Confluent Platform / MSK).

  • Configure and tune brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies for high throughput and low latency.

  • Perform cluster upgrades, rolling restarts, and broker replacements with zero downtime.

  • Implement and manage Kafka Connect pipelines for data ingestion and egress across heterogeneous systems.

  • Administer Kafka Streams and ksqlDB deployments for real-time stream processing workloads.

  • Maintain Schema Registry and enforce schema governance standards across teams.

  • Define and track SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health.

  • Design and implement cloud infrastructure using Infrastructure as Code (IaC) with Terraform.

  • Build automated deployment pipelines for Kafka configuration changes using GitOps workflows (ArgoCD, Flux).

  • Create self-service tooling and runbooks to reduce toil for development teams.

  • Automate topic provisioning, ACL management, and schema registration via APIs and CLI tooling.

  • Integrate tools like GitLab CI/CD or Cloud Build for automated testing and deployment.

  • Ensure seamless integration of data pipelines with other GCP services such as BigQuery and Cloud Storage.

  • Monitor and optimize the performance, reliability, and cost of Kafka and streaming pipelines.

  • Implement security best practices for GCP resources, including IAM policies, encryption, and network security.

  • Ensure observability is an integral part of the infrastructure platforms and provides adequate visibility into their health, utilization, and cost.

  • Collaborate extensively with cross-functional teams to understand their requirements, educate them through documentation and training, and improve adoption of the platforms and tools.

Qualifications:

  • 10+ years of overall experience in DevOps, cloud engineering, or data engineering.

  • 5+ years of experience in Kafka at production scale.

  • Deep expertise in Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.

  • Proficiency with container orchestration (Kubernetes / Helm) and deploying Kafka via Strimzi, Confluent Operator, or equivalent.

  • Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) in cloud environments.

  • Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).

  • Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.

  • Expertise in Infrastructure as Code (IaC) tools like Terraform or Cloud Deployment Manager.

  • Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.

  • Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.

  • Knowledge of containerization and orchestration tools like Docker and Kubernetes.

  • Strong scripting skills for automation (e.g., Bash, Python).

  • Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.

  • Familiarity with logging tools like Cloud Logging or ELK Stack.

  • Strong problem-solving and analytical skills.

  • Excellent communication and collaboration abilities.

  • Ability to work in a fast-paced, agile environment.


The base pay range for this position varies based on the geographic location. More information about the pay range specific to candidate location and other factors will be shared during the recruitment process. Individual pay is determined based on location of residence and multiple factors, including job-related knowledge, skills and experience.

San Francisco Bay Area:

156,400 - 265,700 USD Annual

All Other US Locations:

136,000 - 231,000 USD Annual

As a part of the total compensation package, this role may be eligible for a bonus. For information on our benefits click here.

Skills Required

  • 10+ years of overall experience in DevOps, cloud engineering, or data engineering
  • 5+ years of experience in DevOps, cloud engineering, or data engineering
  • Proficiency in Google Cloud Platform (GCP) services
  • Expertise in Infrastructure as Code (IaC) tools like Terraform
  • Strong experience with Looker, Tableau, or ThoughtSpot administration
  • Knowledge of real-time data streaming technologies (Apache Kafka, Pub/Sub)
  • Familiarity with data orchestration tools (Apache Airflow, Cloud Composer)
  • Strong proficiency in SQL query optimization
  • Experience with CI/CD tools (Jenkins, GitLab CI/CD, Cloud Build)
  • Knowledge of containerization and orchestration tools (Docker, Kubernetes)
  • Strong scripting skills for automation (Bash, Python)
  • Experience with monitoring tools (Cloud Monitoring, Prometheus, Grafana)
  • Familiarity with logging tools (Cloud Logging, ELK Stack)
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration abilities

The Company
HQ: San Jose, CA
1,618 Employees
Year Founded: 1999

What We Do

Innovative communications service providers rely on Calix platforms to help them master and monetize the complex infrastructure between their subscribers and the cloud. Calix is the leading global provider of the cloud and software platforms, systems, and services required to deliver the unified access network and smart premises of tomorrow. Our platforms and services help our customers build next generation networks by embracing a DevOps operating model, optimize the subscriber experience by leveraging big data analytics, and turn the complexity of the smart home and business into new revenue streams.
