Modern Classrooms Project

AI Video Generation Architect

Washington, DC, USA

In-Office

170K-200K Annually

Senior level

Edtech • Social Impact

The Role

Design and implement an end-to-end generative AI video pipeline that converts lesson specs into inspected, versioned videos. Build agentic orchestration, programmatic animation, TTS narration, evaluation/judging systems, symbolic verification of worked examples, and scalable Python/TypeScript media infrastructure.

Start Date: ASAP

Role Type: Full-Time, Salaried

Background: Software development

Location: Remote, Flexible (USA based)

Salary: $170,000-$200,000 per year, plus benefits

Who We Are:

The Modern Classrooms Project (MCP) is a 501(c)(3) nonprofit organization that empowers educators to build classrooms that respond to every student’s needs. Founded by two award-winning teachers, we lead a movement of educators in implementing a self-paced, mastery-based instructional model that leverages technology to foster human connection, authentic learning, and social-emotional growth.

To date, we have reached more than 100,000 teachers in 150+ countries through our free online course. We are an ambitious, idealistic team led by former classroom teachers, and we are passionate about what we do.

Job Description - Why we need you!

Effective instructional videos make high-quality instruction accessible to all learners, regardless of experience or background. Every day, in classrooms around the world, Modern Classroom educators replace live lectures with instructional videos so that students can learn at their own pace, in school and/or at home. Good videos enhance learning, but they are time-consuming to produce. A single high-quality lesson video can take hours to plan, record, edit, and caption.

We need an experienced, hands-on, AI-native engineer to build a brand new, state-of-the-art generative pipeline that turns specifications into high-quality instructional videos — complete with animations, synchronized AI narration, captions, and automated ground-truth quality verification. You will own the video render path end to end, from the canonical specification to the final rendered output, creating intuitive, powerful tools that will directly support educators and students every day.

Key Responsibilities

As our AI Video Generation Architect, you will be a senior individual contributor on our Engineering Team, reporting to the Head of Engineering and collaborating closely with the Chief Innovation Officer to ship features that make a real difference for students and educators.

You’ll be joining a small and growing team of talented software engineers working together to solve the problems teachers and students face every day. We’re building a world where every student can succeed, and we need you to help us make that happen.

You will:

Architect the video generation pipeline end to end. Design the gen-AI pipeline that transforms lesson specifications into storyboards, scene graphs, scripts, and production plans. Every stage emits deterministic lessons-as-code and structured intermediate artifacts — scene specs, asset manifests, timing maps — that can be inspected, versioned, cached, diffed, and selectively re-rendered.
Ship multiple substantial features per week. This is a minimum velocity bar, not an exaggeration. You will leverage AI and agentic coding to build incredible software, very, very quickly.
Build the multi-agent production workflow. Develop agentic orchestration (LangGraph or equivalent) in which an orchestrator delegates to specialist agents: pedagogy analyst, instructional scriptwriter, slide designer, animator, narrator, and a panel of graders, evaluators, and LLM judges, with structured outputs and human-in-the-loop labeling and fine-tuning.
Engineer the video generation pipeline. Build brand-consistent, design-system-driven video generation from structured content: layout engines and templates, LaTeX/KaTeX mathematical typesetting, programmatic diagrams and charts, and text-faithful image generation for illustrations with automated readability checks. Design programmatic motion to support worked examples with narration: kinetic typography, transitions, animated number lines and area models. Run parallelized rendering with generative video models (e.g. Veo / Kling / Seedance). Produce narration with TTS (e.g. ElevenLabs v3 / Gemini-TTS) using audio tags and SSML, pronunciation lexicons for mathematical vocabulary, consistent voice identities across a course, multi-voice dialogue, and multilingual narration, and embed open-license music.
Build the ground-truth quality system. Construct golden datasets of spec-to-video pairs annotated by educators. Implement rubric-based scoring with calibrated LLM- and VLM-as-judge evaluators: frame-level visual fidelity, verification of on-screen mathematics, A/V sync validation, pedagogical fidelity checks against the source spec's learning objectives, reading-level analysis, and K-12 content safety screens. Symbolically verify every worked example with a computable ground-truth verification system: if the video teaches 3/4 + 1/8, a deterministic symbolic engine should independently confirm the answer (7/8) before any student sees it.
Architect resilient, high-scale media infrastructure. Design and scale the distributed backend across Python and TypeScript that carries the pipeline: render queues and job orchestration, transcoding and streaming (HLS), and provenance-aware metadata for AI-generated media. Own the systems design and ensure our foundational architecture is ready to scale.
Raise the bar for the team. Review the work of teammates and contractors. Collaborate with teammates on architecture and implementation reviews. Write PR comments, design docs, and agent skills that make the next person faster.
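
To make the symbolic verification idea above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of MCP's actual system: the function name `verify_sum` and the idea that a worked example arrives as fraction strings are hypothetical. It relies only on exact rational arithmetic from the standard library, so the check is deterministic and does not depend on a model's output.

```python
from fractions import Fraction


def verify_sum(terms: list[str], claimed: str) -> bool:
    """Return True if the exact sum of the fraction strings equals the claimed answer.

    Hypothetical helper: assumes each term and the claimed answer are strings
    such as "3/4" that fractions.Fraction can parse.
    """
    total = sum((Fraction(t) for t in terms), Fraction(0))
    return total == Fraction(claimed)


# The worked example from the posting: 3/4 + 1/8 should equal 7/8.
assert verify_sum(["3/4", "1/8"], "7/8")

# A typical student-style error (adding numerators and denominators) is rejected.
assert not verify_sum(["3/4", "1/8"], "4/12")
```

A check like this could run as a gate before rendering, so a video whose script states a wrong answer never reaches a student.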

You should apply if:

You are AI-native. You are an expert in continuous multi-session development with Claude Code and/or OpenAI Codex. You are an expert at prompt engineering and context engineering. You write Agent Skills the way other engineers write unit tests. You practice Spec-Driven Development (GitHub Spec Kit or equivalent) as part of your normal workflow.
You have built real backend AI orchestration layers that run when you're not watching. You think in graphs — shared state flowing through nodes, conditional edges, interrupts, and circuit breakers. You have shipped non-trivial agentic pipelines using LangGraph, Python, and TypeScript, or equivalent. You treat durable execution, structured outputs, human-in-the-loop checkpoints, and provider-agnostic model routing as baseline design constraints. You have built evaluation harnesses, annotated datasets, and versioned prompt chains as first-class artifacts.
You are a programmatic media craftsperson. You have deep experience with a programmatic animation framework (e.g. Manim, Remotion, Motion Canvas) and strong FFmpeg fundamentals: codecs, containers, color, audio streams, muxing. You understand TTS model trade-offs, expressive direction with audio tags and SSML, pronunciation lexicons, forced alignment and word-level timestamps, and loudness standards. You can hear when the pacing is wrong for a twelve-year-old learner, and you fix it in the pipeline, not the waveform.
You treat quality as a measurable system. You build golden datasets and calibrated judges before you scale generation. You combine deterministic checks (schemas, layout constraints, symbolic math verification, A/V sync) with LLM- and VLM-as-judge evaluation validated against human labels. You catch the subtly wrong diagram, the mispronounced denominator, the worked example that's off by one — and the same eye applies to agent-generated code, which is plausible but not always right. You do not ship what you cannot measure.
You are self-directed. You thrive in small, high-autonomy teams and startups where the surface area is broad and the context shifts constantly. You write clearly. You own a problem end-to-end without waiting for a ticket to tell you what to do next.
You love to learn. You're actively leveraging the latest developments in AI and applying them to enhance both your own and others' work. You're also motivated by MCP's mission and vision, and eager to build teacher- and student-facing products.
You want to shape the world. You're motivated to be part of something larger than yourself. You believe that the highest value of your talent is using it to empower others. You're ready to make a real difference in educators' and young people's lives.

It would also be helpful if:

You have experience building edtech products.
You have experience handling sensitive and/or confidential data, particularly in an education context (COPPA, CIPA, FERPA, PPRA, SOC 2).

Compensation and Benefits

We aim to offer a competitive compensation package, as well as the opportunity to work in a fast-growing nonprofit that is on a mission to improve education worldwide. This includes:

Salaried position: $170,000-$200,000 gross salary per year
Employer-sponsored health insurance through CareFirst BlueCross BlueShield
Employer-sponsored dental and vision insurance through MetLife
Participation in Vanguard 403(b) deferred-compensation plan with 3% employer match
Paid Time Off, inclusive of: vacation/PTO (20 days), paid holidays, paid parental leave, sick and safe paid time off, "Me Days", and the ability to earn paid Comp time off
Annual budget for MCP-funded Continuous Learning for the program(s) you request (available after 6 months of continuous full-time employment)
FSA and Dependent Care FSA access
Company-paid life insurance coverage at 1x salary
Access to Wishbone Pet Insurance Benefit
Ability to work remotely and to set your own hours (within reason)

Skills Required

Proven AI-native engineering experience with agentic workflows and prompt/context engineering (Claude Code, OpenAI Codex or equivalent)
Experience building backend AI orchestration layers and multi-agent pipelines (LangGraph or equivalent) using Python and TypeScript
Experience with programmatic animation frameworks (e.g., Manim, Remotion, Motion Canvas)
Strong FFmpeg fundamentals (codecs, containers, audio streams, muxing, transcoding)
Experience with TTS systems, SSML, audio tags, pronunciation lexicons, forced alignment, multi-voice narration
Experience designing scalable media infrastructure: render queues, job orchestration, HLS streaming, provenance-aware metadata
Ability to build evaluation harnesses, annotated datasets, rubric-based scoring and LLM/VLM judge pipelines
Experience with LaTeX/KaTeX typesetting and programmatic diagrams/charts for educational content
Experience with spec-driven development, versioned prompt chains, durable execution, and human-in-the-loop checkpoints
Ability to design symbolic/computable verification for worked examples and deterministic checks for generated content
Strong written communication, ability to write design docs, PR comments, and agent skills
Background in software development
Experience building edtech products
Experience handling education-sensitive/confidential data and compliance (COPPA, CIPA, FERPA, PPRA, SOC 2)

The Company

Year Founded: 2018

What We Do

Modern Classrooms Project is a nonprofit organization that empowers educators to meet every student's needs through a research-backed instructional model utilizing blended, self-paced, and mastery-based learning. The organization equips teachers with classroom-tested tools and techniques to foster human connection, authentic learning, and social-emotional growth, ensuring that every student is appropriately challenged and supported regardless of their individual background, ability, or attendance.