Software Engineer (Large Scale Training)

Hiring Remotely in Jerusalem, ISR
Remote or Hybrid
Junior
Conversational AI • Generative AI
Open Creative Intelligence for Everyone Who Imagines and Builds.
The Role
As a Software Engineer on the ML team, you will develop and maintain large-scale model training systems, keeping them fast and reliable and making them easy for researchers to work with.
Who we are

Lightricks is an AI-first company creating next-generation content creation technology for businesses, enterprises, and studios with a mission to bridge the gap between imagination and creation. At our core is LTX-2, an open-source generative video model, built to deliver expressive, high-fidelity video at unmatched speed. It powers both our own products and a growing ecosystem of partners through API integration.

The company is also known globally for pioneering consumer creativity through products like Facetune, one of the world's most recognized creative brands, which helped introduce AI-powered visual expression to hundreds of millions of users worldwide. We combine deep research, user-first design, and end-to-end execution from concept to final render to bring the future of expression to all.

About the Role

This is a software engineering role on an ML team. You'll own the systems that make large-scale model training fast, reliable, and pleasant to work with: the distributed training framework, the data pipelines feeding it, the performance characteristics of every step on the critical path, and the day-to-day developer experience for the researchers who depend on it.

You don't need to come in as an ML expert. You do need to be a strong engineer who gets excited about hard systems problems: squeezing throughput out of accelerator clusters, hunting down stragglers across hundreds of machines, designing abstractions that hold up as the codebase grows, and making the unglamorous parts of training infrastructure work well.

If you've ever looked at a large-scale system and thought "there's no reason this should be this slow / inefficient / hard to maintain / complex," this role is built for you.

Key Responsibilities
  • Build and maintain the distributed training framework: orchestration, checkpointing, fault tolerance, observability, and the ergonomics researchers interact with daily.
  • Profile end-to-end training runs and eliminate bottlenecks wherever they live: compute, memory, interconnect, storage, or the data pipeline.
  • Collaborate with researchers to translate model ideas into training code that runs efficiently, and flag when an architectural choice will be expensive before it ships.
  • Own a shared codebase the team relies on: correctness, readability, testing, and long-term maintainability matter as much as the benchmark numbers.
  • Work close to the metal where it pays off: write or integrate custom GPU kernels, tune collective communication, and exploit hardware features that off-the-shelf frameworks leave on the table.
Your skills and experience
  • 2+ years of professional software engineering experience, ideally including work on performance-sensitive or distributed systems.
  • Strong software engineering fundamentals. You write clean, tested, maintainable Python, and you're comfortable reading and writing modern C++.
  • Real experience with performance work: profiling, optimization, and reasoning about systems where latency, throughput, and resource contention actually matter.
  • Comfort with distributed systems: you've debugged things that only break at scale and have intuitions for where they tend to go wrong.
  • A bias toward understanding systems end-to-end rather than treating any layer as a black box.
  • Familiarity with Kubernetes or similar environments for running and scaling large workloads.

* ML training experience is a bonus. If you have it, great, but we'd rather hire a strong systems engineer who's curious about ML than an ML engineer who's lukewarm about infrastructure.

Nice to have

  • Working knowledge of at least one accelerator architecture (GPU, TPU, or similar), or a clear track record of going deep on hardware when the problem calls for it.
  • Experience with JAX/Pallas, Triton, CUDA, OpenCL, Metal, or similar accelerator programming.
  • Prior exposure to ML training pipelines, even informally; pet projects count.
Why Join Us

We’re here to push the boundaries of what’s possible with AI and video - not for the buzz, but for the craft, the challenge, and the chance to make something genuinely new.
We believe in an environment where people are encouraged to think, create, and explore. Real impact happens when people are empowered to experiment, evolve, and elevate together. At Lightricks, every breakthrough starts with great people and a collaborative mindset. If you're looking for a place that combines deep tech, creative energy, and zero buzzword culture, you might be in the right place.

We've got you covered:
  • We run daily door-to-door shuttles and offer Car-to-go subscriptions for several locations in central Israel, plus free parking and train-station pickups.
  • We’re proud to have 2 chef-led restaurants on site by the legendary Machneyuda Group (yes, that Machneyuda!), plus a bakery nestled in the heart of our office, filled daily with the scent of fresh pastries.
  • We empower employees with cutting-edge tools and learning opportunities to grow and succeed: workshops, platform access and training, subscriptions, and clear guidelines for responsible AI use.

Skills Required

  • 2+ years of professional software engineering experience
  • Strong software engineering fundamentals in Python and C++
  • Experience with performance-sensitive or distributed systems
  • Understanding of debugging in distributed systems

LTX Compensation & Benefits Highlights

  • Healthcare Strength: Medical coverage includes multiple plan choices (PPO, HDHP with HSA funding, and an in‑network option) with prescription, dental (including adult orthodontia), and vision, plus access to some onsite or virtual health centers. Feedback suggests the overall package is broad and competitive for the space.
  • Retirement Support: The 401(k) design combines an automatic employer contribution with a matching contribution after eligibility, alongside HSA/FSA options and a small company seed for healthcare spending. This structure provides steady, predictable retirement backing.
  • Parental & Family Support: New Parent Pay provides up to nine weeks at full salary for birth, adoption, surrogacy, or foster placement, with additional support like backup child/elder care, tutoring discounts, and family‑building assistance up to $20,000. These programs indicate a strong emphasis on family support alongside flexible and volunteer time off.

The Company
HQ: Jerusalem, Israel
360 Employees

What We Do

At our core is LTX, an open-source generative video model built to deliver expressive, high-fidelity video at unmatched speed. It powers our own products and a growing ecosystem of creative partners worldwide. We're one of the few companies globally building our own multimodal foundation models, competing directly with the world's largest AI labs. Startup speed. Experienced leadership. A product people already love. Backed by a proven business model and unicorn-scale resources.

We were founded by four computer-science PhDs who never stopped thinking like researchers. That mindset still defines us: deep technical ambition, a bias toward building, and a belief that the best creative tools haven't been invented yet. The team has done it before. Lightricks' suite of apps, including the famous Facetune, has over 500 million downloads worldwide and has won numerous prestigious awards, including Apple's App of the Year, the Apple Design Award, and both Apple and Google Play's Best of the Year.

What makes us "us": at LTX, innovation starts with people, not buzzwords. We move fast, speak plainly, and take real ownership. Impact here comes from passion, curiosity, and initiative, and from raising the bar for yourself and the people around you. We believe the best work happens together. The culture here is genuinely collaborative: people share knowledge, back each other up, and win as a team. When things click, everyone feels it.

You'll have what you need to do your best work: unrestricted access to compute, tools, and resources. We're AI-native in practice: first access to the latest models, real infrastructure, and a team that learns together in real time.

We move at the pace of AI, not just in what we ship, but in how we think and work. Priorities shift, the market moves, and we move with it. If you thrive on change, adapt quickly, and want to be part of a team that keeps up, you'll feel right at home. We build, break, iterate, repeat.
What matters here is how you think, what you're curious about, and where you want to go.

LTX Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We work hybrid. There are set in-office days built into the week, and the rest is remote. We design the mix intentionally, because in-person time makes the work better.

Typical time on-site: 0 days a week
HQ: Jerusalem, Israel
Chicago, Illinois
New York, New York