Senior Platform & SRE Engineer (On-Prem & AI Systems)

Posted 2 Days Ago
Bengaluru, Bengaluru Urban, Karnataka, IND
Hybrid
Senior level
Artificial Intelligence • Enterprise Web • Software • Generative AI
The Role
Own and design infrastructure architecture for cloud, hybrid, and on-prem AI deployments. Build scalable deployments, CI/CD/GitOps pipelines, observability stacks, and reusable IaC modules. Define SLOs, run incident response and postmortems, optimize performance and cost, ensure security and compliance, and collaborate with ML, backend, and product teams to deliver reliable enterprise AI platform services.
Summary Generated by Built In
Location: Bengaluru / Hybrid

Team: Platform Engineering & Infrastructure

We are building lean, agentic AI systems and enterprise-grade developer platforms designed for IT and DevOps teams who need reliable, secure, and cost-efficient AI deployments. Our products run in cloud, hybrid, and fully on-prem environments, enabling enterprises to streamline testing, monitoring, compliance, and operational efficiency.

As a Senior Platform & SRE Engineer (On-Prem & AI Systems), you will own the infrastructure layer that powers all our AI services. You will design scalable, secure, and fault-tolerant environments, orchestrate on-prem deployments for enterprise customers, and ensure platform reliability across cloud and customer VPC setups. This role sits at the intersection of infrastructure engineering, DevOps, SRE, and AI system deployment.

You will define the platform architecture, build automation, improve observability, optimize performance, and work closely with product and ML teams to enable fast, reliable delivery of our AI-driven features.

This role is for engineers who think in systems, automate everything, and thrive in environments where reliability, security, and efficiency are non-negotiable.

What You’ll Own
  • End-to-end infrastructure architecture for cloud and on-prem deployments

  • Scalable, reproducible deployments of AI, ML, and microservice workloads

  • SRE responsibilities: uptime, SLO/SLA definitions, incident response, postmortems

  • CI/CD pipelines, GitOps workflows, and automated release processes

  • Observability stacks (OpenTelemetry, Prometheus, Grafana, ELK)

  • Platform performance, CPU-based model serving, and cost efficiency

  • Security-first infrastructure design: secrets, IAM, isolation, least-privilege access

  • Reusable Terraform/Helm/Ansible modules

  • Collaboration with backend, ML, and product teams on platform-level decisions

  • Operational excellence across monitoring, reliability, and scalability

What You’ll Bring

Must-Have Skills
  • 5+ years in Infrastructure, SRE, or DevOps roles

  • Deep experience with on-prem deployments (VMs, proxies, firewalls, private networks)

  • Strong Terraform, Helm, and Kubernetes experience (EKS, GKE, self-managed clusters)

  • Observability expertise: Prometheus, Grafana, OpenTelemetry

  • CI/CD expertise: GitHub Actions, GitLab CI, ArgoCD, or similar

  • Strong Linux fundamentals, networking knowledge, and Docker internals

  • Experience deploying distributed microservices in production

  • Ability to debug infrastructure issues end-to-end

Great-to-Have
  • Experience supporting AI/ML workloads, model serving, vector DBs

  • Familiarity with open-source LLMs and CPU-based inference optimizations

  • Experience with air-gapped/on-prem enterprise deployment models

  • Security certifications or experience with SOC2 / enterprise compliance

  • Performance engineering, scalability tuning, load testing

Why This Role Matters

You will be one of the most critical hires in shaping our core platform, the foundation on which our agentic AI systems operate. Your work will determine how fast we can innovate, how reliably we can operate, and how securely we can deploy AI in enterprise environments.

You will directly influence:

  • Our managed and on-prem enterprise architecture

  • Product reliability & SLAs

  • Deployment experience for customers

  • Overall developer velocity and system scalability

This is a career-defining opportunity to build a next-generation AI platform used by enterprise IT and DevOps teams globally.

The Company
19 Employees
Year Founded: 2025

What We Do

Disseqt AI provides an AI assurance platform for the full enterprise lifecycle, specializing in the testing, monitoring, and governance of agentic AI. The company enables organizations to validate AI behavior against internal policies, conduct red teaming, and maintain audit trails to ensure reliability and compliance with regulations like the EU AI Act, helping enterprises move from experimentation to production with confidence.
