The Role
The role involves creative web scraping and large-scale parsing or OCR pipelines, and it requires exceptional attention to detail.
Calaveras AI shipped the best code pretraining dataset on the market to multiple major AI companies. Our customers include 4 of the top 6 AI companies on ChatArena.
You may be a good fit for this role if you have two or more of the following traits:
- Experience with creative or difficult web scraping;
- Experience with large-scale parsing or OCR pipelines;
- Exceptional attention to detail.
Compensation and benefits:
- Base salary of $150k-$500k+
- Substantial equity in addition to base salary and bonus
- Substantial performance bonus in addition to base salary and equity
- Visa sponsorship and relocation support
Skills Required
- Experience with web scraping
- Experience with large-scale parsing or OCR
- Exceptional attention to detail
The Company
What We Do
Calaveras offers:
- Over 100B tokens of novel pretraining code data, sold to 4 frontier AI labs
- Rapid-turnaround, trillion-token-scale custom pretraining data procurement for code and non-code white-collar tasks
- Robustly tested RL environments for coding capabilities
- Custom RL environment development, with a focus on core coding capabilities







