VLM Run

ML Intern - 2026

Reposted 11 Days Ago

2 Locations

Remote or Hybrid

Internship

Artificial Intelligence • Computer Vision • Machine Learning • Software

The Role

Work on vision-language model (VLM) development and infrastructure: improve VLM understanding, OCR and function-calling workflows, fine-tuning recipes, evaluation, and robustness, and optimize training and serving pipelines. Own projects end to end, from experiments to production-ready code, and collaborate on scale, cost, and reliability.

Summary Generated by Built In

Join us as we build VLM Run – the enterprise infrastructure layer for visual intelligence. Our mission is to give developers a unified interface to fine-tune, specialize, and operationalize Vision-Language Models (VLMs) that turn images, PDFs, screenshots, and video into reliable, schema-true structured data for production insights and automation – built for scale, security, and SLAs.

We’re looking for exceptional ML interns (Master’s & PhD students) to help us scale the future of Visual AI. You’ll thrive here if you bring strong research and engineering skills, care about good abstractions, and are excited to ship real product.

ML/CV Development: Improve our core VLM capabilities (see Orion), including vision-language understanding, OCR + function-calling workflows, fine-tuning recipes, and robustness.
ML Infrastructure: Help optimize the VLM stack, focusing on training efficiency, evaluation pipelines, quantization/distillation, and cost-efficient serving and scaling.
High Agency: Own a scoped project end-to-end. Turn ambiguity into experiments, results, and shipped code.

Requirements

Currently pursuing an MS/PhD in CS/EE/Math or equivalent.
Strong Python skills and comfort with PyTorch.
Familiarity with Transformers/ViTs and the Hugging Face ecosystem (transformers, datasets).
Ability to read papers, reproduce results, and communicate findings clearly.

Nice to have

Experience with fine-tuning tooling (peft, trl), evaluation frameworks, or dataset curation.
Familiarity with model serving or perf work (vLLM, TensorRT/Triton, Ray, FlashAttention).
GCP/AWS, Docker, and basic MLOps experience.
Bonus: GitHub repo with 100+ stars, recent peer-reviewed paper, OSS contributions.

🗒️ Other Details

Internship Terms: Winter and Summer positions available (exact dates flexible per academic schedule).
Compensation: Paid internship, competitive with seed-stage ML startups.
Location: Santa Clara, CA – at least some in-person collaboration preferred (we’re right off 101, next to AMD’s HQ offices).

The Company

6 Employees

Year Founded: 2022

What We Do

VLM Run is an enterprise infrastructure platform for visual intelligence, providing a unified API to fine-tune, specialize, and operationalize Vision Language Models (VLMs). The platform lets enterprises extract structured, schema-true JSON data from unstructured visual sources, including images, PDFs, and videos, and is designed for production-grade accuracy, security, and scalability.