VLM Run

Founding AI/ML Engineer

Reposted 3 Days Ago

Santa Clara, CA, USA

Hybrid

$150K-$220K Annually

Mid level

Artificial Intelligence • Computer Vision • Machine Learning • Software

The Role

Build and scale the infrastructure layer for visual intelligence: optimize VLM inference and GPU serving, design multimodal APIs and structured outputs, and keep backend systems scalable, reliable, and observable, with solid CI/CD and developer experience. Drive 0→1 product work with strong testing discipline, schema validation, and performance optimization.

Summary Generated by Built In

Join us as we build VLM Run – the enterprise infrastructure layer for visual intelligence. Our mission is to give developers a unified way to fine-tune, specialize, and run Vision-Language Models (VLMs) that turn images, PDFs, screenshots, and video into reliable, schema-true structured data for production insights and automation – built for scale, security, and SLAs.

We’re looking for exceptional engineers to help us build and scale the infrastructure layer for visual intelligence. You’ll do well here if you bring strong technical craft, high ownership, and strength in one or more of these areas:

Platform & Infra: Own and optimize the VLM inference stack (see Orion) end-to-end – from GPU serving and latency/cost to scalable backend systems and reliability.
Developer Experience: Design clean, ergonomic APIs for multimodal apps – tool/function calling, structured outputs, and workflows developers actually want to build.
High Agency + Velocity: We move fast on hard problems. You’ll take ideas from 0→1, set the bar for quality, and help define what “production-grade visual intelligence” looks like.

🎓 Required Expertise (BS & 4+ YoE)

LLM Experience: Integrated or built applications with LLMs (OpenAI, HuggingFace, Ollama, vLLM), with an understanding of prompt engineering, function calling, and structured outputs.
Backend Engineering: Python, FastAPI, async API design, schema validation, caching, and performance optimization.
Infra & DevOps: Docker, Kubernetes, CI/CD, observability (logging, metrics, tracing), GCP or AWS.
Datastores & Systems: Postgres, MongoDB, Redis; experience with scalable, reliable data pipelines.
Developer Experience: Strong testing discipline (TDD), clean code, GitHub workflows (PRs, reviews, CI), and internal tooling mindset.
[BONUS] SaaS Experience: Shipped full-stack dev platforms or SaaS products – from landing pages to auth, billing, telemetry, and infra. Email us with a one-liner if you've done this.

🗒️ Other Details

Pay + Equity Range: $150K-$220K / yr, 0.5-3% equity
Competitive compensation and benefits: We pay market rate for seed-stage startups plus equity options, and offer great healthcare and a 401(k).
In-person: At least 4 days a week in Santa Clara, CA (we’re right by 101, next to AMD’s HQ offices).

Skills Required

BS degree and 4+ years of experience
Experience integrating or building applications with LLMs (OpenAI, HuggingFace, Ollama, vLLM); prompt engineering and function calling knowledge
Backend engineering with Python, FastAPI, async API design, schema validation, caching, and performance optimization
Infrastructure and DevOps experience: Docker, Kubernetes, CI/CD, observability (logging, metrics, tracing), GCP or AWS
Datastores and systems experience: Postgres, MongoDB, Redis; scalable, reliable data pipelines
Developer experience practices: TDD, clean code, GitHub workflows (PRs, reviews, CI), internal tooling mindset
SaaS experience (shipping full-stack platforms, auth, billing, telemetry, infra)

The Company

6 Employees

Year Founded: 2022

What We Do

VLM Run is an enterprise infrastructure platform for visual intelligence, providing a unified API to fine-tune, specialize, and operationalize Vision Language Models (VLMs). It lets enterprises extract structured, schema-true JSON data from unstructured visual sources, including images, PDFs, and videos. The platform is designed for production-grade accuracy, security, and scalability.