Exa is building a search engine from scratch to serve every AI application. We build massive-scale infrastructure to crawl the web, train state-of-the-art embedding models to index it, and develop extremely high-performance vector databases in Rust to search over it. We also own a $5M H200 GPU cluster and routinely run batch jobs across tens of thousands of machines. This isn't your average startup :)
As a Web Crawler engineer, you'd be responsible for crawling the entire web. Basically build Google-scale crawling!
Desired Experience
You have extensive experience building and scaling web crawlers, or would be excited to ramp up very quickly
You have experience with a high-performance language (C++, Rust, etc.)
You’re comfortable optimizing the crap out of a system
You care about the problem of finding high quality knowledge and recognize how important this is for the world
Responsibilities
Build a distributed crawler that can handle 100M+ pages per day
Optimize crawl politeness and rate limiting across thousands of domains
Design systems to detect and handle dynamic content, JavaScript rendering, and anti-bot measures
Create intelligent crawl scheduling and prioritization algorithms for maximum coverage efficiency
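To give a flavor of the politeness and rate-limiting work described above, here's a minimal sketch of a per-domain politeness limiter in Rust. This is an illustrative example, not Exa's actual implementation; the `PolitenessLimiter` type and its API are hypothetical, and a production crawler at this scale would need sharding, robots.txt handling, and adaptive delays.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-domain politeness limiter: enforces a minimum
/// delay between successive requests to the same domain, a standard
/// pattern for polite crawling.
struct PolitenessLimiter {
    min_delay: Duration,
    // Next time each domain may be fetched.
    next_ok: HashMap<String, Instant>,
}

impl PolitenessLimiter {
    fn new(min_delay: Duration) -> Self {
        Self { min_delay, next_ok: HashMap::new() }
    }

    /// Returns how long the caller should wait before fetching from
    /// `domain`; `Duration::ZERO` means the fetch may proceed now.
    fn delay_for(&mut self, domain: &str) -> Duration {
        let now = Instant::now();
        let wait = match self.next_ok.get(domain) {
            Some(&t) => t.saturating_duration_since(now),
            None => Duration::ZERO,
        };
        // Reserve the next allowed slot so concurrent callers back off.
        self.next_ok
            .insert(domain.to_string(), now + wait + self.min_delay);
        wait
    }
}

fn main() {
    let mut limiter = PolitenessLimiter::new(Duration::from_millis(500));
    let first = limiter.delay_for("example.com");
    let second = limiter.delay_for("example.com");
    let other = limiter.delay_for("other.org");
    assert_eq!(first, Duration::ZERO); // first hit proceeds immediately
    assert!(second > Duration::ZERO);  // same domain must back off
    assert_eq!(other, Duration::ZERO); // unrelated domain is unaffected
    println!("ok");
}
```

In a real system this map would be sharded across workers and combined with token buckets for domains that advertise higher crawl budgets.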
This is an in-person opportunity in San Francisco. We're happy to sponsor international candidates (e.g., STEM OPT, OPT, H-1B, O-1, E-3).
What We Do
Exa was built with a simple goal — to organize all knowledge. After several years of heads-down research, we developed novel representation learning techniques and crawling infrastructure so that LLMs can intelligently find relevant information.