Architect - Machine Learning (MLOps Specialist)

Posted 4 Days Ago
2 Locations
In-Office
Senior level
Artificial Intelligence • Big Data • Machine Learning
The Role
Lead the design and implementation of MLOps strategies, oversee enterprise-grade ML/LLM pipelines, and collaborate with cross-functional teams to deliver scalable AI/ML solutions.
Summary Generated by Built In

While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in offering them a culture built on transparency, diversity, integrity, learning and growth.
If working in an environment that encourages you to innovate and excel, not just in your professional life but in your personal life too, interests you, you would enjoy your career with Quantiphi!

Role: Architect - Machine Learning

Experience: 7-14 Years

Location: Mumbai/Bangalore

Must have skills & Qualifications:

  • 8+ years working in ML/AI engineering or MLOps roles with strong architecture exposure.

  • Strong expertise in AWS cloud-native ML stack, including: EKS (primary), ECS, Lambda, API Gateway, CI/CD (CodeBuild/CodePipeline or equivalent)

  • Hands-on experience with at least one major MLOps toolset and awareness of alternatives: MLflow, Kubeflow, SageMaker Pipelines, Airflow, BentoML, KServe, Seldon

  • Deep understanding of model lifecycle management (training → registry → deployment → monitoring).

  • Experience implementing or supporting LLMOps pipelines, including: prompt versioning, evaluation metrics, automation frameworks

  • Deep understanding of ML lifecycle: data ingestion, feature engineering, training, evaluation, model packaging, CI/CD, drift detection, monitoring, and governance.

  • Strong experience with AWS SageMaker (Training, Processing, Batch Transform, Pipelines, Feature Store, Model Registry, Model Monitor).

  • Experience implementing ML CI/CD pipelines including automated training, testing, validation, model promotion, and endpoint deployment.

  • Ability to build dynamic and versioned pipelines using SageMaker Pipelines, Step Functions, or Kubeflow.

  • Strong SQL and data transformation experience using Snowflake, Databricks, Spark, or EMR.

  • Experience with feature engineering pipelines and Feature Store management (SageMaker or Feast).

  • Understanding of lineage tracking: training data snapshot, feature versions, code versioning, metadata tracking, reproducibility.

  • Hands-on experience with Bedrock, OpenAI, Anthropic, or Llama models.

  • Experience with CloudWatch, SageMaker Model Monitor, Prometheus/Grafana, or Datadog.

  • Strong foundation in Python and cloud-native development patterns.

  • Solid understanding of security best practices, IAM, secrets management, and artifact governance.

Good to have skills:

  • Experience with vector databases, RAG pipelines, or multi-agent AI systems.

  • Exposure to DevOps and infrastructure-as-code (Terraform, Helm, CDK).

  • Hands-on understanding of model drift detection, A/B testing, canary rollouts, and blue-green deployments.

  • Familiarity with observability stacks (Prometheus, Grafana, CloudWatch, OpenTelemetry).

  • Knowledge of Lakehouse (Delta/Iceberg/Hudi) architecture.

  • Ability to translate business goals into scalable AI/ML platform designs.

  • Strong communication and cross-team collaboration skills.

  • Ability to guide engineering teams through technical uncertainty and design choices.

Key Responsibilities:

  • Architect and implement the MLOps strategy for the EVOKE Phase-2 programme, ensuring alignment with the project proposal and delivery roadmap.

  • Design and own enterprise-grade ML/LLM pipelines covering model training, validation, deployment, versioning, monitoring, and CI/CD automation.

  • Build container-oriented ML platforms (EKS-first) while evaluating alternative orchestration tools with similar capabilities (Kubeflow, SageMaker, MLflow, Airflow, etc.).

  • Implement hybrid MLOps + LLMOps workflows, including prompt/version governance, evaluation frameworks, and monitoring for LLM-based systems.

  • Serve as a technical authority across multiple internal and customer projects, not limited to EVOKE, contributing architectural patterns, best practices, and reusable frameworks.

  • Enable observability, monitoring, drift detection, lineage tracking, and auditability across ML/LLM systems.

  • Collaborate with cross-functional teams — data engineering, platform, DevOps, and client stakeholders — to deliver production-ready ML solutions.

  • Ensure all solutions adhere to security, governance, and compliance expectations, particularly around handling cloud services, Kubernetes workloads, and MLOps tools.

  • Conduct architecture reviews, troubleshoot complex ML system issues, and guide teams through implementation across cloud-native ML platforms.

  • Mentor engineers and provide guidance on modern MLOps tools, platform capabilities, and best practices.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

Top Skills

Airflow
Anthropic
API Gateway
AWS
Bedrock
BentoML
CDK
CI/CD
CloudWatch
Databricks
Datadog
ECS
EKS
EMR
Grafana
Helm
KServe
Kubeflow
Lambda
Llama
MLflow
OpenAI
Prometheus
Python
SageMaker
Seldon
Snowflake
Spark
Terraform

The Company
HQ: Marlborough, MA
3,494 Employees
Year Founded: 2013

What We Do

Quantiphi is an award-winning AI-first digital engineering company driven by the desire to solve transformational problems at the heart of business.
Quantiphi solves the toughest and most complex business problems by combining deep industry experience, disciplined cloud and data-engineering practices, and cutting-edge artificial intelligence research to achieve quantifiable business impact at unprecedented speed.
