Machine Learning Engineer

Posted Yesterday
Toronto, ON, CAN
In-Office
Senior level
Artificial Intelligence • Consumer Web • Social Media • Generative AI
The Role
Build and productionize real-time ML systems for consumer AI: serve and optimize large-scale diffusion and LLM inference, improve model performance (quantization, distillation, pruning), implement personalization, ranking, and moderation, scale GPU infrastructure, run experiments and A/B tests, and own reliability, observability, and cost-performance tradeoffs.
Summary Generated by Built In
About GenTube

GenTube is a consumer AI creation platform built on a simple belief: creation should be entertainment.

Last year, people created 70M+ images on GenTube. What matters more is what’s emerging now: a small but growing group opens the app with no prompt, no goal, and stays for hours. No nudges. No incentives. That behavior is the signal we’re building around.

We’re an early, opinionated team based in Toronto, backed by top consumer AI investors and operators who’ve built at global scale.

Our ambition is straightforward and hard: build the next great consumer AI creation company for a billion people.

The Role

We’re hiring a Product ML Engineer to build the intelligence layer of GenTube.

This is not a research-only role.

And not an infra-only role.

You’ll work at the intersection of models, systems, and product — shipping ML that real users feel every day. You’ll make explicit tradeoffs between speed, quality, cost, and delight — and measure them.

If you want ownership, rigor, and real-world scale, keep reading.

What You’ll Do

Core ML Infrastructure
  • Build inference pipelines serving millions of generations per week.

  • Design real-time and streaming inference for diffusion models, LLMs, and multimodal systems.

  • Optimize latency across serving, batching, caching, routing, and model selection.

Model Performance
  • Adapt and productionize foundation models (Stable Diffusion, Flux, LLMs).

  • Implement quantization, distillation, pruning, and compilation.

  • Experiment with LoRAs, ControlNets, and adapters for style, control, and personalization.

Intelligence Layers
  • Build ranking, recommendation, and personalization systems.

  • Implement content understanding with embeddings, similarity search, clustering, and classification.

  • Build moderation and safety systems that scale without killing creativity.

Production Systems
  • Scale GPU infrastructure from thousands to millions of daily generations.

  • Profile bottlenecks and optimize utilization and cost.

  • Run A/B tests on model variants; monitor quality, drift, and p99 latency.

  • Own reliability, observability, and graceful degradation.

Relentless Experimentation
  • Ship new model variants frequently.

  • Test speed vs. quality tradeoffs using real user behavior.

  • Close the loop: user behavior → signal → model improvement.

Why Join
  • Founders have scaled consumer products to 100M+ users and led a $150M+ AI exit.

  • Backed by top consumer AI investors and operators.

  • We’re building the kind of company Canada rarely builds: consumer-first, global, culturally relevant.

  • Small team. High bar. No bureaucracy.

  • A rag-tag group of pirates in the desert.

Location: Toronto (downtown). On-site.

Comp: Competitive salary + meaningful equity.

Benefits: Health, dental, vision, unlimited PTO, creative tools & education stipend.

Taste, curiosity, and ownership matter more than pedigree.

If you want to ship ML that millions of people feel, measure what works, and push the edge of consumer AI — we want to hear from you.

Apply by sending your application to [email protected]

Skills Required

  • Build inference pipelines serving millions of generations per week.
  • Design real-time and streaming inference for diffusion models, LLMs, and multimodal systems.
  • Optimize latency across serving, batching, caching, routing, and model selection.
  • Adapt and productionize foundation models (Stable Diffusion, Flux, LLMs).
  • Implement quantization, distillation, pruning, and model compilation.
  • Experiment with LoRAs, ControlNets, and adapters for style, control, and personalization.
  • Build ranking, recommendation, and personalization systems.
  • Implement content understanding with embeddings, similarity search, clustering, and classification.
  • Build moderation and safety systems that scale without overly constraining creativity.
  • Scale GPU infrastructure and optimize utilization and cost.
  • Profile production bottlenecks and optimize throughput and p99 latency.
  • Run A/B tests on model variants and monitor quality, drift, and p99 latency.
  • Own reliability, observability, and graceful degradation for ML services.
  • Ship new model variants frequently and iterate using user behavior signals.

The Company
4 Employees
Year Founded: 2024

What We Do

1851 Labs is a Toronto-based AI company that develops GenTube, a social AI art platform designed for fast, remixable image generation and collaborative creation. Their mission is to treat AI creation as entertainment, enabling users to instantly create and remix images with a focus on low-latency generation and intuitive, playful tools for a global community of creators.
