GenTube is a consumer AI creation platform built on a simple belief: creation should be entertainment.
Last year, people created 70M+ images on GenTube. What matters more is what’s emerging now: a small but growing group opens the app with no prompt, no goal, and stays for hours. No nudges. No incentives. That behavior is the signal we’re building around.
We’re an early, opinionated team based in Toronto, backed by top consumer AI investors and operators who’ve built at global scale.
Our ambition is straightforward and hard: build the next great consumer AI creation company for a billion people.
The Role
We’re hiring a Product ML Engineer to build the intelligence layer of GenTube.
This is not a research-only role.
And not an infra-only role.
You’ll work at the intersection of models, systems, and product — shipping ML that real users feel every day. You’ll make explicit tradeoffs between speed, quality, cost, and delight — and measure them.
If you want ownership, rigor, and real-world scale, keep reading.
What You’ll Do
Core ML Infrastructure
Build inference pipelines serving millions of generations per week.
Design real-time and streaming inference for diffusion models, LLMs, and multimodal systems.
Optimize latency across serving, batching, caching, routing, and model selection.
Adapt and productionize foundation models (Stable Diffusion, Flux, LLMs).
Implement quantization, distillation, pruning, and compilation.
Experiment with LoRAs, ControlNets, and adapters for style, control, and personalization.
Build ranking, recommendation, and personalization systems.
Implement content understanding with embeddings, similarity search, clustering, and classification.
Build moderation and safety systems that scale without killing creativity.
Scale GPU infrastructure from thousands to millions of daily generations.
Profile bottlenecks and optimize utilization and cost.
Run A/B tests on model variants; monitor quality, drift, and p99 latency.
Own reliability, observability, and graceful degradation.
Ship new model variants frequently.
Test speed vs. quality tradeoffs using real user behavior.
Close the loop: user behavior → signal → model improvement.
Founders have scaled consumer products to 100M+ users and led a $150M+ AI exit.
Backed by top consumer AI investors and operators.
We’re building the kind of company Canada rarely builds: consumer-first, global, culturally relevant.
Small team. High bar. No bureaucracy.
A rag-tag group of pirates in the desert.
Location: Toronto (downtown). On-site.
Comp: Competitive salary + meaningful equity.
Benefits: Health, dental, vision, unlimited PTO, creative tools & education stipend.
Taste, curiosity, and ownership matter more than pedigree.
If you want to ship ML that millions of people feel, measure what works, and push the edge of consumer AI — we want to hear from you.
Apply by sending your application to [email protected]
Skills Required
- Build inference pipelines serving millions of generations per week.
- Design real-time and streaming inference for diffusion models, LLMs, and multimodal systems.
- Optimize latency across serving, batching, caching, routing, and model selection.
- Adapt and productionize foundation models (Stable Diffusion, Flux, LLMs).
- Implement quantization, distillation, pruning, and model compilation.
- Experiment with LoRAs, ControlNets, and adapters for style, control, and personalization.
- Build ranking, recommendation, and personalization systems.
- Implement content understanding with embeddings, similarity search, clustering, and classification.
- Build moderation and safety systems that scale without overly constraining creativity.
- Scale GPU infrastructure and optimize utilization and cost.
- Profile production bottlenecks and optimize throughput and p99 latency.
- Run A/B tests on model variants and monitor quality, drift, and p99 latency.
- Own reliability, observability, and graceful degradation for ML services.
- Ship new model variants frequently and iterate using user behavior signals.
What We Do
1851 Labs is a Toronto-based AI company that develops GenTube, a social AI art platform designed for fast, remixable image generation and collaborative creation. Their mission is to treat AI creation as entertainment, enabling users to instantly create and remix images with a focus on low-latency generation and intuitive, playful tools for a global community of creators.







