Machine Learning Engineer, Vision

Bengaluru, Bengaluru Urban, Karnataka, IND
In-Office
Mid level
Artificial Intelligence • Software
The Role
Build and deploy vision-language models across the full lifecycle: data pipelines, training/fine-tuning on GPU clusters, evaluation and benchmarks, inference optimization, production integrations, and client-facing solutions for document processing and visual search.
About Sarvam

Sarvam is building the bedrock of Sovereign AI for India: a full-stack sovereign AI platform spanning research, models, infrastructure, and applications, with a singular focus on making AI genuinely work for India. Sarvam works with leading enterprises and public institutions, partners with leading Indian brands including Tata Capital, SBI Life, CRED, IDFC, and LIC, and is backed by Lightspeed, Peak XV, and Khosla Ventures.

About the Role

You will work across the full lifecycle of vision-language model (VLM) development — data, training, evaluation, and production. The team's scope will evolve as the field does; we want engineers who are comfortable with that.

What You'll Do
  • Design and run training and fine-tuning pipelines for large vision-language models on GPU clusters

  • Build multimodal data pipelines — ingestion, filtering, deduplication, synthetic generation, and quality assurance

  • Implement and experiment with new architectures and training techniques from research

  • Build evaluation harnesses, benchmarks, and automated regression tracking

  • Optimise models for inference — quantisation, batching, and serving infrastructure

  • Build robust pipelines and integrations that put vision model capabilities in the hands of end users

  • Translate real-world problems into well-scoped ML tasks with the right data and evaluation strategy

  • Work directly with clients to understand their use cases — document processing, visual search, form extraction — and own the solution end to end

  • Build production-grade systems on top of Sarvam Vision and open-source models: multimodal pipelines, retrieval-augmented workflows, and structured output extraction

  • Debug and improve deployed solutions — latency, accuracy, edge cases, and integration with client infrastructure

What We're Looking For
  • Strong Python and PyTorch — comfortable reading and modifying model internals

  • Hands-on experience training or fine-tuning large models, including debugging broken runs

  • Experience building data pipelines at scale

  • Solid grounding in transformer architectures and modern training techniques

  • Comfort with ambiguity — the roadmap is not fully pre-specified

  • Strong focus on secure coding practices, code quality, and system reliability

  • Undergraduate degree in a technical discipline (CS, statistics, physics, or equivalent)

Bonus Points
  • Experience with vision-language models or multimodal systems

  • Distributed training (FSDP, DeepSpeed, Megatron-LM)

  • Post-training methods — RLHF, DPO, or alignment techniques

  • Inference optimisation — quantisation, distillation, serving

  • Prior exposure to vision-based AI systems or document processing pipelines

  • Contributions to open-source projects or a solid GitHub portfolio

Why Sarvam?

Sarvam is a fast-moving, high talent-density team building full-stack AI for India, working on problems that push the frontiers of AI with real population-scale impact.

  • Work alongside researchers, engineers, builders, and business leaders who move fast and hold each other to a very high bar

  • High ownership and high impact, from day one

  • Everything we do is AI-first, from the way we build and ship to the way we think about problems

  • You can work on problems that could change how an entire country learns, works, and communicates

If you want to work on problems at the frontier of AI in India, Sarvam is the place to be.

The Company
HQ: Bangalore, Karnataka
50 Employees
Year Founded: 2023

What We Do

We are an AI/ML research and development company on a mission to build reliable, performant, enterprise-grade AI systems at scale for India. We are committed to building the full stack for generative AI for the rich and diverse landscape of India, investing mainly in three areas:
  • Models: developing both efficient large-scale Indic language models and bespoke enterprise models
  • Platform: building an enterprise-grade platform that empowers organisations to develop and ship creative and performant genAI applications at scale
  • Ecosystem: contributing to open-source models and datasets, and leading efforts for large-scale data curation in the public-good space
