Algotale-Senior AI/ML Solution Architect - Generative AI & Agentic Systems

NextHire Consulting

Reposted 13 Days Ago

Hiring Remotely in India

Remote

Senior level

Artificial Intelligence • HR Tech • Professional Services • Software

The Role

Lead design and implementation of enterprise-scale generative AI and agentic systems. Architect multi-agent orchestration, RAG and inference pipelines, fine-tuning/quantization strategies, integrations (APIs, webhooks, connectors), and secure data/model management. Optimize deployments across cloud, edge, and on-premises while establishing evaluation, selection, and lifecycle practices.

Summary Generated by Built In

Senior AI/ML Solution Architect - Generative AI & Agentic Systems

Algotale is a premier IT staffing and software solutions provider, delivering top-tier talent and custom-built technology to drive business success. With a strong network of skilled professionals across software development, cloud solutions, and project management, we help companies scale efficiently and execute projects seamlessly. Our flexible engagement models cater to both short-term and long-term needs, ensuring precision-matched expertise for every requirement. From IT staffing to full-cycle software development, Algotale empowers businesses with innovative, high-impact solutions.

Position Overview

We are looking for a Senior AI/ML Solution Architect with deep expertise in Generative AI and agentic systems to lead the design and implementation of enterprise-scale AI solutions. This role requires a unique blend of hands-on technical expertise in both Large Language Models (LLMs) and Small Language Models (SLMs), combined with the architectural vision to deploy these solutions across diverse computing environments.

The ideal candidate will architect scalable agentic solutions, implement advanced fine-tuning strategies, and design comprehensive integration systems that connect AI capabilities with enterprise applications. You will be at the forefront of our AI transformation initiatives, working with cutting-edge technologies while maintaining a practical approach to deployment and optimization.

Experience Requirements

Overall Experience: 8+ years in technology and software development

Generative AI Experience: 2+ years of hands-on experience with LLMs and generative AI systems

Solution Architecture Experience: 4+ years architecting enterprise-scale solutions

Key Responsibilities

Architecture & Design

Design and architect scalable agentic solutions using advanced LLM capabilities

Implement Model Context Protocol (MCP) integrations to connect applications with diverse external services and APIs

Develop multi-agent orchestration systems for complex workflow automation

Design context and memory management systems for persistent agent interactions

Technical Implementation

Build and optimize Retrieval-Augmented Generation (RAG) systems for efficient knowledge retrieval

Implement agent frameworks (LangChain, LangGraph, Semantic Kernel, Agno) for various deployment environments

Design and deploy model inference pipelines optimized for different computing environments (cloud, edge, on-premises)

Develop comprehensive fine-tuning strategies for both Large Language Models (LLMs) and Small Language Models (SLMs)

Architect SLM deployment strategies for resource-constrained environments

Implement model compression and quantization techniques for efficient inference

Integration & Connectivity

Architect REST/gRPC/GraphQL APIs and SDK integrations for seamless service connectivity

Implement event-driven architectures using webhooks and message buses

Design secure authentication and authorization systems (SSO/OIDC)

Build connectors for popular platforms (Slack, Jira, Salesforce, CRM/ERP systems)

Data & Model Management

Design comprehensive data preprocessing pipelines including cleaning, deduplication, and PII redaction

Implement embedding creation and re-embedding strategies for optimal retrieval

Develop chunking and windowing strategies for mobile-optimized content processing

Establish model selection criteria and evaluation frameworks

Required Technical Skills

Core AI/ML Expertise

Foundation Models: Deep experience with GPT-4, Claude, LLaMA, and other state-of-the-art LLMs

Small Language Models (SLMs): Expertise in deploying and optimizing SLMs (Phi-3, Gemma, TinyLlama) for mobile environments

Agent Frameworks: Proficiency in LangChain, LangGraph, Microsoft Semantic Kernel, Agno, and custom agent development

RAG Systems: Advanced knowledge of retrieval-augmented generation, vector databases, and semantic search

Fine-tuning & Adaptation

Advanced fine-tuning techniques: LoRA/QLoRA, DoRA, AdaLoRA for parameter-efficient training

Model compression: Pruning, quantization (INT8/INT4), knowledge distillation

Prompt-tuning, adapters, prefix tuning, and P-tuning v2 methodologies

RLHF/RLAIF techniques for alignment and preference learning

Domain-specific fine-tuning for mobile use cases and vertical applications

Deployment & Optimization

SLM Deployment: Expertise in deploying Small Language Models across various computing environments

Multi-Platform Optimization: Experience optimizing both LLMs and SLMs for cloud, edge, and on-premises deployment

Efficient Inference: Knowledge of quantization (GPTQ, AWQ, GGML), pruning, and distillation techniques

Model Compression: Advanced techniques for reducing model size while maintaining performance

Real-time Processing: Expertise in streaming inference and adaptive reasoning depth control

Performance Optimization: Proficiency in autoscaling, rate limiting, and resource management

Adaptive Fine-tuning

Environment-specific model adaptation and optimization

Federated learning approaches for distributed fine-tuning

Few-shot and zero-shot learning techniques for resource-efficient adaptation

Integration Technologies

MCP Implementation: Deep understanding of Model Context Protocol for service integration

API Development: Expertise in designing and implementing REST, gRPC, and GraphQL APIs

Event Systems: Experience with event buses, webhooks, and real-time communication

Security: Knowledge of secure storage, caching, and access control systems

Development Frameworks

Libraries: TensorFlow, PyTorch, Hugging Face Transformers, LlamaIndex

Application Development: Web frameworks, desktop applications, API development

Cloud Platforms: AWS, GCP, Azure with focus on AI/ML services

DevOps: CI/CD pipelines, containerization (Docker/Kubernetes), monitoring

Preferred Qualifications

Master's or PhD in Computer Science, AI, Machine Learning, or related field

Published research or contributions to open-source AI/ML projects

Experience with multi-modal models and cross-modal applications

Knowledge of MLOps best practices and model lifecycle management

Experience with regulatory compliance in AI systems (GDPR, AI Act, etc.)

Track record of leading AI transformation initiatives in enterprise environments

Certifications in cloud platforms (AWS, GCP, Azure) with focus on AI/ML services

Technical Competencies to Be Assessed

System design and architecture for distributed AI systems

Code review and optimization for production AI deployments

Performance benchmarking and model evaluation methodologies

Cost optimization strategies for large-scale AI deployments

Security and privacy considerations in AI systems

Scalability patterns for AI applications

Skills Required

8+ years in technology and software development
2+ years hands-on experience with LLMs and generative AI systems
4+ years architecting enterprise-scale solutions
Experience with foundation models (GPT-4, Claude, LLaMA) and SLMs (Phi-3, Gemma, TinyLlama)
Proficiency with agent frameworks (LangChain, LangGraph, Microsoft Semantic Kernel, Agno)
Design and implementation of RAG systems, embedding strategies, and vector DB-backed semantic search
Advanced fine-tuning and adaptation techniques (LoRA/QLoRA, DoRA, AdaLoRA, RLHF/RLAIF)
Model compression and quantization expertise (pruning, INT8/INT4, GPTQ, AWQ, GGML)
API and integration development experience (REST, gRPC, GraphQL, webhooks, event buses)
Experience deploying and optimizing models across cloud, edge, and on-premises (AWS, GCP, Azure)
Experience with ML libraries and tools (TensorFlow, PyTorch, Hugging Face Transformers, LlamaIndex)
Containerization and orchestration (Docker, Kubernetes) and CI/CD for ML deployments
Designing secure authentication and authorization (SSO/OIDC) and secure data handling (PII redaction)
Experience building connectors/integrations for platforms like Slack, Jira, Salesforce, CRM/ERP
Master's or PhD in CS, AI, ML or related field
Published research or open-source contributions in AI/ML
Experience with multi-modal models and MLOps/model lifecycle management
Experience with regulatory compliance for AI systems (GDPR, AI Act)
Cloud platform certifications (AWS/GCP/Azure) with AI/ML focus

The Company

100 Employees

What We Do

NextHire Consulting is an AI-driven recruiting platform that streamlines the hiring process for companies. By leveraging AI agents for sourcing, screening, and interviewing, the platform enables teams to focus on pre-qualified finalists. It provides data-driven insights into candidate soft skills and behavioral styles, aiming to disrupt traditional recruitment models with efficient, automated, and science-based talent acquisition solutions for businesses of all sizes.