We're hiring a senior software engineer to help build the largest case law dataset. Our data covers US laws and court decisions and powers our lawyer-facing AI platform and B2B data services.
Responsibilities include:
Building pipelines that augment documents with metadata, e.g., which decisions overrule another decision, which decisions are an appeal/remand/consolidation of another decision, etc. Our competitors still have these labeled by humans making $300+/hr.
Building systems to ensure the reliability and accuracy of hundreds of web scrapers.
Optimizing and evaluating our core utils, which do things like extracting and resolving citations, determining which courts are able to overrule which other courts, etc.
Exposing core services on our data via APIs, MCPs, and websockets.
Benchmarking and evaluating core tasks (human and synthetic).
We believe in skipping what can be skipped and appreciate simple solutions to complex problems.
Good candidates for this role should be (1) technical generalists, definitely across the backend (bonus for fullstack), and (2) comfortable working with data pipelines, including basic to intermediate infra/devops.
Interest in or experience with stats/ML/AI is a bonus, but not critical. You should be cautiously AI-pilled.
Tech stack isn't critical, but Python and SQL are core. You should definitely be able to stand up your own projects end-to-end on your preferred infra.
Role is in-person at our office in SoHo, NYC. Competitive cash/benefits/salary.
Skills: Python, PostgreSQL, ElasticSearch, Playwright, GCP, Pinecone, Prefect, NeonDB.
Visa sponsorship available; relocation not available.
Skills Required
- Strong Python experience
- Proficiency with SQL and PostgreSQL
- Experience building and maintaining data pipelines
- Experience with web scraping and scraper reliability (e.g., Playwright)
- Familiarity with ElasticSearch
- Experience deploying services and basic to intermediate infra/devops (GCP or equivalent)
- Ability to build and expose services via APIs and websockets
- Experience with Prefect or orchestration tooling
- Familiarity with vector DBs or embeddings infrastructure (e.g., Pinecone)
- Familiarity with NeonDB or similar modern DB offerings
- Backend technical generalist comfortable owning projects end-to-end
- Interest in or experience with statistics, ML, or AI (bonus, not critical)
- Fullstack experience (bonus)
- Able to work in-person in SoHo, NYC
What We Do
Midpage is an AI-powered legal research and drafting platform designed for litigators and law students. It streamlines the process of searching, reading, and analyzing US case law, statutes, and regulations using generative AI tools. By automating the transformation of research into briefs and memos and offering tools like grid-based search, Midpage helps legal professionals manage information overload and increase drafting efficiency.