Senior Software Engineer

Posted 2 Days Ago
2 Locations
In-Office
Senior level
Cloud • Information Technology • Consulting
The Role
Provide 24/7 production support and on-call ownership for Kafka clusters; monitor performance; troubleshoot producers, consumers, brokers, security (TLS/SASL), schema/serialization, and performance issues; analyze logs/metrics; onboard applications; act as SME during incidents and drive resilience improvements.
Summary Generated by Built In

Who We Are

At Kyndryl, we run and reimagine the mission-critical technology systems that drive advantage for the world’s leading businesses. We are at the heart of progress, with proven expertise and a continuous flow of AI-powered insight that enable smarter decisions, faster innovation, and a lasting competitive edge. For our people, known as Kyndryls, that means doing purposeful work that powers human progress. Join us and experience a flexible, supportive environment where your well-being is prioritized and your potential can thrive.


The Role

Role Summary

A Senior Kafka Operations Engineer is responsible for ensuring the stability, performance, and reliability of Kafka-based data streaming platforms in production. The role focuses on end-to-end operational support, advanced troubleshooting, and enabling development teams to build resilient, high-performing Kafka integrations.

This is a hands-on operational role, working in a 24/7 support environment, with deep involvement in Kafka clients, cluster health, and real-time incident resolution.

What You Will Do

  • Own production support for Kafka environments in a 24/7 on-call rotation
  • Monitor and maintain Kafka cluster performance, availability, and reliability
  • Perform advanced troubleshooting across the full Kafka stack: producers, consumers, brokers, and clusters
  • Analyze logs and metrics to proactively detect and resolve issues
  • Ensure minimal downtime and uninterrupted data flow

Deep-Dive Troubleshooting Areas

  • Kafka Clients
    • Producer delivery failures, retries, idempotence, acknowledgments
    • Consumer lag, offset issues, delivery guarantees
  • Connectivity & Security
    • TLS handshake failures
    • SASL authentication issues
  • Schema & Serialization
    • Schema compatibility problems
    • Serializer/deserializer failures
  • Performance
    • Slow producers/consumers
    • Throughput bottlenecks (e.g., compression, batching)
  • Cluster Health
    • Partition hot spots
    • Broker performance issues
    • Replication/reliability concerns
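
As a rough illustration of the consumer lag and partition hot spot items above, the sketch below computes per-partition lag as the log end offset minus the last committed offset, then flags partitions whose lag is well above the mean. It is not taken from Kyndryl's tooling: the function names, the 2x threshold, and the snapshot values are all hypothetical, and it assumes a partition with no committed offset starts at offset 0.

```python
from typing import Dict, List


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag = log end offset minus last committed offset."""
    lag = {}
    for partition, end in end_offsets.items():
        # Assumption: a partition with no committed offset is read from offset 0.
        done = committed.get(partition, 0)
        # Clamp at zero in case the two snapshots were taken at slightly different times.
        lag[partition] = max(end - done, 0)
    return lag


def hot_partitions(lag: Dict[int, int], factor: float = 2.0) -> List[int]:
    """Return partitions whose lag exceeds `factor` times the mean lag (hypothetical rule)."""
    if not lag:
        return []
    mean = sum(lag.values()) / len(lag)
    return sorted(p for p, v in lag.items() if mean > 0 and v > factor * mean)


# Hypothetical offset snapshot for a three-partition topic, for illustration only.
end_offsets = {0: 1500, 1: 1480, 2: 9000}
committed = {0: 1490, 1: 1480, 2: 3000}

lag = consumer_lag(end_offsets, committed)
print(lag)                  # {0: 10, 1: 0, 2: 6000}
print(hot_partitions(lag))  # [2]
```

In practice the offsets would come from the cluster itself (for example, from a consumer group description), and a sustained skew like partition 2 here often points to an uneven key distribution or a slow consumer instance.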

Collaboration & Impact

  • Support and guide development teams on Kafka best practices
  • Help onboard applications onto Kafka
  • Act as a subject matter expert during incidents and root cause analysis
  • Improve system resilience and operational efficiency

What Makes This a Senior Role

  • Deep understanding of distributed systems and Kafka internals
  • Ability to troubleshoot complex, multi-layer issues under pressure
  • Strong communication with both engineering and non-engineering teams
  • Ownership of business-critical production environments


Being You

The “Kyn” in Kyndryl means kinship, which represents the strong bonds we have with each other, our customers and our communities. We focus on ensuring all Kyndryls feel included and we welcome people of all cultures, backgrounds, and experiences. Even if you don’t meet every requirement, we encourage you to apply. We believe in growth, and we’re excited to see what you can bring. At Kyndryl, employee feedback has told us that our number one driver of employee engagement is belonging. That sense of belonging — being a valued, respected, trusted member of the team — is fundamental to our culture and fueling great experiences for our customers. This dedication to welcoming everyone into our company means that Kyndryl gives you the ability to thrive and contribute to our culture of empathy and shared success. That’s The Kyndryl Way.

What You Can Expect

Your career with us isn’t just a job—it’s an adventure with purpose.  We offer a dynamic, hybrid-friendly culture that supports your well-being and empowers you to grow. Our Be Well programs are thoughtfully designed to support your financial, mental, physical, and social health—because we know that when you feel your best, you do your best.
From your very first day, you’ll dive into impactful work that powers the systems our customers rely on every day. You won’t just contribute—you’ll make a difference, tackling meaningful projects that sharpen your skills and fuel your growth.
We’re here to champion your journey. With powerful tools to chart your career path, personalized development goals aligned with your ambitions, and continuous feedback to keep you inspired and on track, you’ll have everything you need to thrive and evolve. You’ll develop in-demand skills to grow your career and achieve your ambitions with access to cutting-edge learning opportunities—from certifications with Microsoft, Google, and Amazon to coaching and hands-on experiences. And through it all, you’ll be part of a culture that values empathy, restless learning, and a devotion to shared success.
We want you to thrive here—and we’re committed to helping you do just that. Ready to make an impact? Join us and help shape what’s next.

Get Referred!

If you know someone who works at Kyndryl, select ‘Employee Referral’ when asked ‘How Did You Hear About Us’ during the application process, and enter your contact's Kyndryl email address.

Skills Required

  • Production support for Kafka environments in a 24/7 on-call rotation
  • Deep understanding of distributed systems and Kafka internals
  • Experience troubleshooting Kafka clients (producers and consumers), brokers, and clusters
  • Experience with Kafka security and connectivity: TLS handshake and SASL authentication troubleshooting
  • Experience diagnosing schema and serialization issues (serializer/deserializer failures, schema compatibility)
  • Experience analyzing logs and metrics to detect and resolve issues and maintain cluster health
  • Ability to act as subject matter expert during incidents and perform root cause analysis
  • Strong communication skills to collaborate with engineering and non-engineering teams and guide best practices

Kyndryl Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Kyndryl and has not been reviewed or approved by Kyndryl.

  • Fair & Transparent Compensation: Pay is considered good in some roles, with mentions of “good pay,” “great pay and benefits,” and an “acceptable salary range” paired with bonuses. Certain senior or consulting tracks are described as market-competitive.
  • Leave & Time Off Breadth: Vacation, paid time off, holidays, and a dedicated volunteer day are highlighted as positives. Parental leave exists companywide alongside sick leave and disability coverage.
  • Wellbeing & Lifestyle Benefits: Remote and hybrid flexibility is emphasized, including 100% remote roles and a formal flexible workplace policy. Well‑being resources such as the Be Well program and an EAP are available.

The Company
HQ: New York City, NY
46,070 Employees
Year Founded: 2021

What We Do

We have the world’s best talent, who design, run, and manage the most advanced and reliable technology infrastructure each day. Together, we think holistically about the health of these vital technology ecosystems. We are a focused, independent company that builds on our foundation of excellence by creating systems in new ways: bringing in the right partners, investing in our business, and working side-by-side with our customers to unlock potential. We're raising the bar. Our experience speaks for itself: We have 90,000 highly skilled employees around the world serving 75 of the Fortune 100. But our purpose is what drives us: Advancing the vital systems that power human progress. Because when a digital ecosystem is healthy, it can more readily adapt and support continuous growth, and that opens up a world of possibility for everyone.
