We're hiring a senior software engineer to help build the largest case law dataset. Our data covers US laws and court decisions, and it powers our lawyer-facing AI platform and B2B data services.
Responsibilities include:
Building pipelines that augment documents with metadata, e.g., which decisions overrule another decision, which decisions are an appeal/remand/consolidation of another decision, etc. Our competitors still have humans making $300+/hr label these by hand.
Building systems to ensure the reliability and accuracy of hundreds of web scrapers.
Optimizing and evaluating our core utils, which do things like extracting and resolving citations, determining which courts are able to overrule which other courts, etc.
Exposing core services on our data via APIs, MCP servers, and WebSockets.
Benchmarking and evaluation of core tasks (human and synthetic).
We believe in skipping what can be skipped and appreciate simple solutions to complex problems.
Good candidates for this role should be (1) technical generalists, definitely across the backend (bonus for fullstack), and (2) comfortable working with data pipelines, including basic to intermediate infra/devops.
Interest/experience with stats/ML/AI is a bonus, but not critical. You should be cautiously AI-pilled.
Tech stack isn't critical, but Python and SQL are core. You should definitely be able to stand up your own projects end-to-end on your preferred infra.
This is a remote role. Additional compensation offered for relocation to NYC.
Skills: Python, PostgreSQL, Elasticsearch, Playwright, GCP, Pinecone, Prefect, NeonDB.
Visa sponsorship is not available.
Skills Required
- Senior software engineering experience (backend-focused)
- Proficiency in Python
- Proficiency in SQL / PostgreSQL
- Experience building and maintaining large-scale web scrapers and ensuring reliability
- Experience building data pipelines and augmenting documents with metadata
- Ability to expose services via APIs and WebSockets
- Basic to intermediate infrastructure / DevOps experience; able to stand up end-to-end projects on preferred infra
- Experience with Elasticsearch
- Experience with Playwright (or equivalent browser automation)
- Experience with GCP
- Experience with Pinecone
- Experience with Prefect (or equivalent workflow orchestration)
- Experience with NeonDB (Postgres-compatible)
- Fullstack experience (bonus)
- Interest or experience with statistics / ML / AI (bonus, not critical)
What We Do
Midpage is an AI-powered legal research and drafting platform designed for litigators and law students. It streamlines the process of searching, reading, and analyzing US case law, statutes, and regulations using generative AI tools. By automating the transformation of research into briefs and memos and offering tools like grid-based search, Midpage helps legal professionals manage information overload and increase drafting efficiency.
