Location: Bengaluru / Hybrid
Team: Platform Engineering & Infrastructure
We are building lean, agentic AI systems and enterprise-grade developer platforms designed for IT and DevOps teams who need reliable, secure, and cost-efficient AI deployments. Our products run in cloud, hybrid, and fully on-prem environments, enabling enterprises to streamline testing, monitoring, compliance, and operational efficiency.
As a Senior Platform & SRE Engineer (On-Prem & AI Systems), you will own the infrastructure layer that powers all our AI services. You will design scalable, secure, and fault-tolerant environments, orchestrate on-prem deployments for enterprise customers, and ensure platform reliability across cloud and customer VPC setups. This role sits at the intersection of infrastructure engineering, DevOps, SRE, and AI system deployment.
You will define the platform architecture, build automation, improve observability, optimize performance, and work closely with product and ML teams to enable fast, reliable delivery of our AI-driven features.
This role is for engineers who think in systems, automate everything, and thrive in environments where reliability, security, and efficiency are non-negotiable.
What You’ll Own
End-to-end infrastructure architecture for cloud and on-prem deployments
Scalable, reproducible deployments of AI, ML, and microservice workloads
SRE responsibilities: uptime, SLO/SLA definitions, incident response, postmortems
Build and manage CI/CD pipelines, GitOps workflows, automated release processes
Implement observability stacks (OpenTelemetry, Prometheus, Grafana, ELK)
Optimize platform performance, CPU-based model serving, cost efficiency
Security-first infrastructure design: secrets, IAM, isolation, least-privilege access
Create reusable Terraform/Helm/Ansible modules
Collaborate with backend, ML, and product teams on platform-level decisions
Drive operational excellence across monitoring, reliability, and scalability
You will be one of the most critical hires in shaping our core platform, the foundation on which our agentic AI systems operate. Your work will determine how fast we can innovate, how reliably we can operate, and how securely we can deploy AI in enterprise environments.
You will directly influence:
Our managed and on-prem enterprise architecture
Product reliability & SLAs
Deployment experience for customers
Overall developer velocity and system scalability
This is a career-defining opportunity to build a next-generation AI platform used by enterprise IT and DevOps teams globally.
Skills Required
- 5+ years in Infrastructure, SRE, or DevOps roles
- Deep experience with on-prem deployments (VMs, proxies, firewalls, private networks)
- Strong Terraform, Helm, and Kubernetes experience (EKS, GKE, self-managed clusters)
- Observability expertise: Prometheus, Grafana, OpenTelemetry, ELK
- CI/CD expertise: GitHub Actions, GitLab CI, ArgoCD, or similar
- Strong Linux fundamentals, networking knowledge, and Docker internals
- Experience deploying distributed microservices in production
- Ability to debug infrastructure issues end-to-end
- Experience creating reusable Terraform/Helm/Ansible modules and automation
- Experience supporting AI/ML workloads, model serving, vector DBs
- Familiarity with open-source LLMs and CPU-based inference optimizations
- Experience with air-gapped/on-prem enterprise deployment models
- Security certifications or experience with SOC2 / enterprise compliance
- Performance engineering, scalability tuning, and load testing experience
What We Do
Disseqt AI provides an AI assurance platform for the full enterprise lifecycle, specializing in the testing, monitoring, and governance of agentic AI. The company enables organizations to validate AI behavior against internal policies, conduct red teaming, and maintain audit trails to ensure reliability and compliance with regulations like the EU AI Act, helping enterprises move from experimentation to production with confidence.