Twin – Research Engineering Internship
Core Platform · Paris / Hybrid / Remote (Europe)
About Twin
Twin builds autonomous agents that are reliable and scalable enough for companies to actually delegate work to them. We engineered the first commercial agent deployed to 500k SMBs — Invoice Operator — achieving 95%+ accuracy at a fraction of the human cost. Over the coming months, we expect this first agent alone to automate millions of man-hours.
To achieve this, we've built, among other things:
A Rust-based browser infrastructure, maintaining low latency at high scale and concurrency
A password manager enabling secure and automated agent authentication, including solving 2FA
A graph-based agent framework, making complex agents robust through modularity and decoupling
A self-correcting infrastructure, allowing agents to learn from mistakes and continuously improve as they encounter new challenges
Founded in 2024, we've raised €12M from LocalGlobe and the founders of companies like Hugging Face, Datadog, and Alan. Our ambition: launch and distribute the next generation of high-impact agents, and become the trusted layer where autonomous work runs.
About the Core Platform team
The Core Platform team builds the foundational infrastructure powering Twin's high-performance, Rust-based agent engine. This engine manages browser automation, communication between agents and humans, and the evaluation and testing pipelines that keep our agents best-in-class.
The team's objective is to continuously improve speed, cost, and reliability at scale, and to lay the groundwork for training our own advanced models that will outperform the current state of the art.
About the position
This is first and foremost a software engineering role — but one with a clear long-term target: the systems you build here will directly enable us to train our own world-class browsing agents, surpassing current industry benchmarks in both performance and reliability.
In the short term, you'll work on the engineering foundations: better data pipelines, smarter evaluation frameworks, and reliability tooling. In the long term, that work will fuel the datasets, metrics, and insights that make our models the best in the world.
Main challenges
Build sophisticated Rust-based data filtering & evaluation systems to identify the highest-quality agent interactions for training
Develop large-scale pipelines to collect, process, and analyze browsing behavior in real time
Create evaluation frameworks to measure and compare agent decision quality — and feed those insights back into model training
Solve complex reliability and performance issues with elegant engineering approaches that translate into better datasets and smarter agents
Requirements
Must-have
Strong software engineering skills and experience shipping reliable code
Creative problem-solving mindset — knowing when engineering beats brute-force model training
Comfortable in lower-level languages (Rust experience a plus)
Interest in AI agents and belief that great models start with great systems
Ability to work fast and autonomously
Based in Paris, or willing to spend at least your first month in our office (remote/hybrid possible after onboarding)
Nice to have
Experience with Rust (any level)
Previous production experience in any technology stack
Understanding of ML concepts (for future model training context)
Experience in data processing, evaluation, or testing infrastructure
No production experience yet? Show us you're exceptional through personal projects, open-source contributions, competitive programming, or other evidence of your ability to solve complex problems with speed and elegance.
What We Do
Twin Labs is a French AI lab backed by Clem Delangue & Thomas Wolf (Hugging Face), Irwan Bello & Romain Huet (OpenAI), Charles Gorintin (Alan, Mistral), Mehdi Ghissassi & Romuald Elie (DeepMind), Yanda Erlich (Weights & Biases), Mathias Gallé (Cohere), and many others.
We’re building the first AI that can act as humans do. Launching soon.