It's fun to work in a company where people truly BELIEVE in what they are doing!
We're committed to bringing passion and customer focus to the business.
Key Responsibilities
Lead and mentor a small team of DevOps, MLOps, and cloud engineers; promote high performance, knowledge sharing, and continuous learning.
Architect and maintain scalable GCP infrastructure for AI agentic workloads (Vertex AI, GKE, Cloud Run, Cloud Functions, Artifact Registry, etc.).
Own DevOps and MLOps pipelines, including:
CI/CD with Google Cloud Build (primary), integrated with Jenkins for hybrid/legacy workflows.
Artifact management using Artifact Registry (primary) with exposure to Nexus Repository for proxying or migration scenarios.
IaC (Terraform preferred), container orchestration, and Python scripting/automation for custom tooling, Glue jobs, or pipeline extensions.
Lead API endpoint security strategy, leveraging:
Apigee for enterprise-grade API management, policy enforcement, quota, monetization (if applicable), advanced security, and analytics.
Complementary GCP-native tools: Identity-Aware Proxy (IAP), Cloud Armor (WAF/DDoS), IAM least-privilege, OAuth 2.0/JWT/mTLS, Secret Manager, VPC Service Controls.
Zero-trust/BeyondCorp principles and threat protection for AI agent communications and customer-facing APIs.
Champion FinOps practices on GCP:
Implement cost monitoring (Cloud Billing, FinOps Hub), optimization recommendations (Recommender, Active Assist), commitment-based discounts (CUDs), budget alerts, and idle resource cleanup.
Drive cost allocation, forecasting, and cross-team accountability for high-cost AI workloads (e.g., model training/inference).
Collaborate with AI/ML engineers to productionize agentic workflows with secure, governed access to models/data.
Ensure high observability (Cloud Operations Suite, Prometheus/Grafana), resilience, and SRE practices (incident response, post-mortems).
Establish cloud governance, compliance, and disaster recovery aligned with business needs.
Required Qualifications & Experience
8+ years in DevOps, cloud engineering, or infrastructure roles, including 4+ years of deep hands-on experience with Google Cloud Platform (GCP).
Proven people leadership (5+ reports) in agile/fast-paced environments.
Strong expertise in securing APIs/services on GCP, preferably with Apigee (enterprise API management, policies, analytics) alongside IAP, Cloud Armor, IAM, and mTLS.
Hands-on experience with:
CI/CD: Google Cloud Build + integration with Jenkins.
Artifact management: Artifact Registry + familiarity with Nexus Repository.
GKE/Cloud Run, monitoring/logging.
Python for automation, scripting, and tooling in DevOps/MLOps contexts.
Solid understanding of networking (VPC, Private Service Connect), security, and compliance.
Experience with AI/ML platforms (Vertex AI, Agent Builder) and MLOps for agentic systems is highly desirable.
Preferred Skills
Google Cloud certifications: Professional DevOps Engineer, Cloud Architect, Cloud Security Engineer (Apigee-related knowledge a plus).
FinOps experience or certification; familiarity with GCP FinOps Hub, Recommender, and commitment management.
Exposure to agentic AI patterns (multi-agent orchestration, RAG) and their infra requirements.
Experience in high-security or regulated environments.
If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!
What We Do
Fractal is one of the most prominent players in the Artificial Intelligence space. Fractal's mission is to power every human decision in the enterprise, and it brings AI, engineering, and design to help the world's most admired Fortune 500® companies. Fractal's products include Qure.ai, which assists radiologists in making better diagnostic decisions; Crux Intelligence, which helps CEOs and senior executives make better tactical and strategic decisions; Theremin.ai, which improves investment decisions; Eugenie.ai, which finds anomalies in high-velocity data; and Samya.ai, which drives next-generation Enterprise Revenue Growth Management. Fractal has more than 3,000 employees across 16 global locations, including the United States, UK, Ukraine, India, Singapore, and Australia. Fractal has consistently been rated as one of India's best companies to work for by The Great Place to Work® Institute, featured as a leader in the Customer Analytics Service Providers Wave™ 2021, Computer Vision Consultancies Wave™ 2020, and Specialized Insights Service Providers Wave™ 2020 by Forrester Research, and recognized as an "Honorable Vendor" in the 2021 Magic Quadrant™ for data & analytics by Gartner.








