Sieve is the only AI research lab exclusively focused on video data. We combine exabyte-scale video infrastructure, novel video understanding techniques, and dozens of data sources to develop datasets that push the frontier of video modeling. Video makes up 80% of internet traffic and has become the enabling digital medium powering creativity, communication, gaming, AR/VR, and robotics. Sieve exists to solve the biggest bottleneck in the growth of these applications: high-quality training data.
Sieve scaled from 0 to $XXM in revenue in the second half of 2025, with a relatively small team of 12 people. We also recently raised our Series A from Tier 1 firms such as Matrix Partners, Swift Ventures, Y Combinator, and AI Grant.
About the Role
As an applied research engineer at Sieve, you’ll build high-performance building blocks and large-scale pipelines to understand video with high precision at internet scale. Often this involves working on ambiguous research problems and finding clever techniques to solve them. You will be working in the computer vision, audio processing, and text processing domains.
You’re likely a good fit if you’re comfortable working with models + APIs and squeezing every drop of performance out of them through clever pre/post-processing, parallelism, pipelining, inference optimization, and occasionally fine-tuning.
Requirements
2+ years of experience in computer vision or audio processing
Strong Python developer with hands-on experience in PyTorch or similar ML frameworks
Excellent communication skills, especially with customers and external teams
Writes clean, maintainable code—bonus points for active GitHub or portfolio projects
Deep passion for the video domain and media technologies
Motivated by building end-to-end products—not just training models
Able to break problems down from customer-level impact to the necessary building blocks
Bonus: Active contributor to open source projects
Bonus: Experience as an early hire at a startup
In-person at our SF HQ
What We Do
Sieve is the only AI research lab exclusively focused on video data.
Video already makes up 80% of internet traffic and has become the dominant medium driving creativity, communication, gaming, AR/VR, and robotics. Unlocking the ability to truly model video is the key to breakthroughs across all of these domains, but progress has been bottlenecked by one thing: high-quality training data. That’s where Sieve comes in.
We bring together exabyte-scale video infrastructure, novel video understanding techniques, and dozens of diverse data sources to create datasets that push the frontier of video modeling. This unique combination allows us to deliver data with unmatched precision, quality, and speed, which has earned the trust of frontier AI labs, Fortune 100 companies, and fast-growing generative AI startups.