AI Engineer (Vision)

Paris, Île-de-France, FRA
Hybrid
$100K-$250K Annually
Mid level
Artificial Intelligence • Security • Software • Cybersecurity

TLDR: You’ll train and fine-tune vision-language models, extend them to video, build alignment pipelines (GRPO, DPO, reward modeling), develop evaluation benchmarks, optimize inference for production, and work with MoE architectures.

About us

White Circle is an AI Safety company building the safety, reliability, and optimization layer for AI systems. At the core of our platform are policies – simple natural-language rules that define what an AI model should and shouldn’t do. We automatically test, enforce, and continuously improve these policies at scale.

  • We’ve raised $11M from top funds, founders, and senior leaders at OpenAI, Anthropic, HuggingFace, Mistral, DeepMind, Datadog, Sentry, and others

  • We process over one hundred million API calls every month

  • We fine-tune and train our own LLMs so they run faster and cheaper than any open or proprietary model

We’re a small, highly focused team. If you want to work deeply on hard problems, see your work ship to production quickly, and influence how AI safety is actually built – you’re the one we need.

You will:
  • Train vision-language models from scratch and fine-tune existing architectures for image understanding

  • Extend VLM capabilities to video: design temporal modeling approaches and handle long contexts efficiently

  • Design evaluation benchmarks that matter: visual QA, spatial reasoning, video comprehension

  • Curate and maintain multimodal datasets — including synthetic data generation pipelines

  • Train and optimize MoE architectures for efficient multimodal inference

  • Deploy models to production: quantization, batching strategies, latency optimization


You’ll fit right in if you:
  • Have 3+ years of experience training and fine-tuning vision-language models (LLaVA, Qwen-VL, InternVL, or similar)

  • Have deep experience with multimodal architectures and understand how vision encoders, projectors, and LLMs fit together

  • Are hands-on with RLHF/alignment for multimodal models (GRPO, DPO, reward modeling), not just for text

  • Have experience with video understanding: temporal modeling, long-context processing, efficient attention mechanisms

  • Have a track record of shipping VLMs to production: you've optimized inference, not just reported benchmark scores

  • Are comfortable with large-scale dataset curation: image-text pairs, video-instruction data, synthetic data generation

  • Are familiar with MoE architectures and their tradeoffs for multimodal workloads

  • Have strong PyTorch skills and experience with distributed training (DeepSpeed, FSDP)

Why White Circle
  • Salary of $100,000 to $250,000 + equity

  • Paid time off in line with your local regulations, no matter where you work from

  • Work from Paris (hybrid) + relocation package

  • Best medical insurance in France

  • All the hardware, tools, and services you need

  • Covered subscriptions for AI agents and IDEs

  • Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez

How we hire
  1. Intro call with one of our colleagues

  2. Complete the take-home assignment

  3. Show your best during the technical interview

  4. Final call with our CEO and CTO

Please submit your application in English. It’s our company language, so you’ll be speaking lots of it if you join.

Skills Required

  • 3+ years training and fine-tuning vision-language models (e.g., LLaVA, Qwen-VL, InternVL or similar)
  • Deep experience with multimodal architectures (vision encoders, projectors, LLM integration)
  • Hands-on experience with RLHF/alignment for multimodal systems (GRPO, DPO, reward modeling)
  • Experience with video understanding: temporal modeling and long-context processing
  • Track record shipping VLMs to production, including inference optimization and latency tuning
  • Experience curating and managing large-scale multimodal datasets, including synthetic data generation
  • Familiarity with MoE (Mixture of Experts) architectures and their tradeoffs for multimodal workloads
  • Strong PyTorch skills and experience with distributed training frameworks (DeepSpeed, FSDP)
  • Practical knowledge of quantization, batching strategies, and other production inference optimizations
The Company
23 Employees
Year Founded: 2025

What We Do

White Circle is an enterprise AI control platform specializing in automated vulnerability detection and protection for AI systems. The company provides a unified system for testing, monitoring, and safeguarding AI applications in real time, focusing on blocking unsafe inputs, preventing jailbreaks, and optimizing model performance. Its mission is to secure AI systems and ensure they remain safe and controllable for businesses worldwide.
