This role is for one of our portfolio companies, not internally at Airtree. Your application will be reviewed by the founder, not an Airtree employee.
This early-stage company, still operating in stealth, is on a mission to eradicate cost overruns in construction, a $3 trillion problem that slows down cities, destroys margins, and erodes trust between builders and clients.
Over 90% of construction projects go over budget, stalling progress across the industry. Our platform takes a new approach: it helps builders stay on budget by detecting and preventing costly variations before they spiral. Its AI-powered early warning system gives builders control over project costs, protecting time, margin, and reputation on every build.
We are at the start of something big, and we’re looking for an Applied AI Engineer (Eval-driven) to build and ship design-audit workflows that consistently meet measurable quality bars. This role blends ML engineering and data science, with a heavy emphasis on problem definition, evaluation, and reliability in real customer workflows.
- Define evaluation problems: success criteria, failure modes, datasets, labelling guidelines, and score functions.
- Build and maintain an evaluation harness: regression tests, edge-case suites, and quality dashboards to prevent backsliding.
- Implement workflow systems end-to-end (data → model/LLM components → post-processing → acceptance testing) until they pass eval thresholds.
- Partner with product and domain stakeholders to translate messy real-world requirements into testable specs.
Requirements
- Strong Python skills and practical experience shipping ML/AI systems (not just experimentation).
- Demonstrated experience designing evals for ML/LLM systems (offline metrics, gold sets, error analysis, regression testing, monitoring).
- Comfort working across data science + engineering tasks: data wrangling, feature/label design, model/LLM iteration, and productionization.
- High ownership and intensity: persistence in closing the loop from “fails eval” to “passes consistently.”
- Experience with document understanding (OCR, parsing, classification/extraction) and structured outputs (schemas, validators).
- Familiarity with AEC/construction workflows (design coordination, QA/compliance, BIM concepts like IFC/Revit).
- Experience building human-in-the-loop review systems and adjudication processes to improve training/eval data.
What We Do
Airtree is a venture capital firm backing Aussie and Kiwi founders, building the iconic technology companies of tomorrow.
We’re powered by our network, dedicating extraordinary resources to help founders shortcut company-building firsts and accelerate their journey from idea to global household name.
As one of the largest and most active early-stage investors in Australia and New Zealand, Airtree’s portfolio of 100+ companies features the region’s breakout tech companies, including Canva, Go1, Employment Hero, Pet Circle, Immutable and Linktree.