Platform Engineer - DevOps

Posted Yesterday
2 Locations
In-Office
Senior level
Artificial Intelligence • Fintech • Analytics • Conversational AI
The Role
Own and operate deployment infrastructure and CI/CD across AWS and GCP using IaC. Build and run the observability stack, define on-call and runbooks, manage multi-cloud cost and capacity, partner with application teams, and lead postmortems and reliability improvements for production systems.
Summary Generated by Built In

GreyLabs AI is building the voice operating system for India's BFSI. Our Agentic Voice AI platform helps banks, insurers, NBFCs, and fintechs automate and humanise millions of customer conversations - across sales, collections, customer service, and compliance - in multiple Indian languages.

In under two years, we've scaled to 50+ enterprise clients, including RBL Bank, AU Small Finance Bank, IDFC FIRST Bank, SBI Life, ICICI Prudential Life, and Motilal Oswal, processing hundreds of millions of conversations. We raised ₹85 Crores in Series A funding led by Elevation Capital with Z47, and were recognised for "Best Use of AI in Fintech" at IFTA 2025.

The Role

This is an SDE3 role on the Platform team. You will own deployment infrastructure, build the observability stack, and manage multi-cloud environments across AWS and GCP. Ownership is wide, the infrastructure you build is load-bearing, and when something breaks in production, you are the person who fixes it and closes the gap so it doesn’t break the same way again.

What You’ll Do

Infrastructure & Deployment

  • Design and maintain infrastructure as code across AWS and GCP - Terraform or equivalent, versioned and reviewed like application code

  • Own the CI/CD pipeline and build deployment automation 

  • Manage multi-cloud cost attribution, capacity planning, and environment parity

Observability & Reliability

  • Own the observability stack (SigNoz, Prometheus, Grafana) - define what gets monitored, what gets alerted on, and what gets filtered out

  • Define on-call practices, maintain runbooks the team can execute under pressure, and lead postmortems that produce structural fixes

  • Partner with application engineers to abstract infrastructure complexity

What We’re Looking For
  • 6-8 years of experience in DevOps, SRE, Platform Engineering, or Software Engineering with significant cloud operations/automation ownership.

  • Has operated at least one stateful distributed system in production - Kafka, ClickHouse, Cassandra, Elasticsearch, or equivalents. 

  • Experience with AWS, GCP or Azure

  • Has built or significantly owned an observability stack with deliberate signal-over-noise decisions

  • Writes infrastructure as code by default; treats infra changes with the same review rigour as application code

  • Writes automation and tooling code fluently - Python, Go, or bash

  • Has managed infrastructure in a regulated or enterprise B2B environment - IAM, network isolation, audit logging, or access control requirements that came from a compliance or security driver, not just engineering preference

  • Has been on-call for a system where downtime had real consequences, and has the instincts that come from that

Strong Signals

Not checkboxes. Signals that tell us you’re the right person.

  • You’ve operated data engineering pipelines (ClickHouse/Debezium/Spark) - cluster tuning, materialized view debugging, slow query diagnosis at scale

  • You’ve managed distributed systems like Kafka (or equivalent) through a real failure - broker loss, consumer group rebalancing, lag that wouldn’t clear - and can walk through exactly what happened, what you changed, and what the runbook looks like now

  • Your observability stack has caught a production issue before a customer opened a ticket

  • You’ve reduced infrastructure cost as a deliberate exercise - not a side effect - and can quantify what changed

Why GreyLabs AI
  • A hard problem in a large market. Building low-latency, multilingual Voice AI for regulated financial institutions - across diverse Indian languages and under RBI and IRDAI compliance requirements - is technically complex and commercially consequential.

  • Real scale, real engineering challenges. The reliability, cost, and infrastructure challenges here reflect actual production load.

  • Scope to shape the technical direction. At our current stage, architectural decisions move quickly from design to production. Senior engineers have direct influence on how the platform is built and what it becomes.

  • Strong backing, proven team. Elevation Capital and Z47 are long-term partners invested in our vision. Our founders built and exited Cogno AI - they understand what it takes to build AI companies that earn enterprise trust.

Skills Required

  • 6-8 years of experience in DevOps, SRE, Platform Engineering, or Software Engineering with cloud operations/automation ownership
  • Operated at least one stateful distributed system in production (Kafka, ClickHouse, Cassandra, Elasticsearch, or equivalent)
  • Experience with AWS, GCP or Azure
  • Built or significantly owned an observability stack (SigNoz, Prometheus, Grafana) and made signal-over-noise decisions
  • Writes infrastructure as code by default (Terraform or equivalent) and treats infra changes with code review rigor
  • Writes automation and tooling code fluently (Python, Go, or Bash)
  • Managed infrastructure in a regulated or enterprise B2B environment (IAM, network isolation, audit logging, access control driven by compliance/security)
  • Has been on-call for production systems where downtime had real consequences, and can lead postmortems that produce structural fixes
  • Operated data engineering pipelines (ClickHouse/Debezium/Spark) — cluster tuning and debugging at scale
  • Managed distributed system failures (e.g., Kafka broker loss, consumer rebalances) and improved runbooks/processes
  • Ran an observability stack that detected production issues before customers opened tickets
  • Demonstrated deliberate infrastructure cost reduction with quantifiable results

The Company
Year Founded: 2023

What We Do

GreyLabs AI is an Agentic Voice AI platform transforming how financial institutions, including banks, fintechs, insurance companies, and broking firms, engage with their customers. The company specializes in automating contact centers using Voice AI Agents capable of sales, support, and collections across multiple Indian languages, blending speech analytics and conversational intelligence to drive higher conversions and lower operating costs.
