Lead Research Engineer

Hiring Remotely in Paris, Île-de-France, FRA
In-Office or Remote
Expert/Leader
The Role
Lead research on AI model architecture, designing structures that optimize performance within execution constraints. Own the training pipeline and collaborate with other teams to enhance model capabilities.
Summary Generated by Built In

A shift is happening in AI that most people have not fully priced in. As models become more capable and agents take over more software work, inference becomes the critical bottleneck. The question stops being whether a model can do the work and becomes whether it can run fast enough to feel like thinking.

Kog was built for that shift.

We co-design the execution engine and the model architecture, specifically for AMD MI300X hardware. Our monokernel runs from first token to last without returning control to the CPU. Our Laneformer architecture is designed to overlap computation and communication by deferring all-reduce by one layer.

Today, Kog serves 2,500 tokens per second. Our next target is 5,000.

Our MoE v3 already outperforms Llama 3.2 3B on CORE benchmarks and shows emergent reasoning capabilities where dense models of similar size score zero.

We are a team of 11 people, including 10 engineers and 4 PhDs, building a different kind of inference company from first principles.

Why this role matters now

Inference is becoming the constraint that shapes product quality, model design, and company velocity at the same time. At Kog, research is not upstream from execution. Research defines what can run fast, what can scale under real hardware constraints, and what kinds of capabilities become reachable in production.

This role sits at that junction. The work you do here will influence the next generation of Kog models, the structure of the training and evaluation loop, and the architectural decisions that determine whether performance gains are incremental or structural.

The problem

Most model research still assumes that architecture quality and execution quality can be optimized separately. That assumption leaves performance on the table and narrows the design space.

Kog took a different route. We co-design the model architecture and the execution engine. Laneformer is a direct expression of that approach. It is built to overlap computation and communication by deferring all-reduce by one layer, which changes what becomes possible inside the generation loop and what constraints every architectural decision must satisfy downstream.

This creates a harder research problem and a more interesting one. Progress comes from designing architectures that are mathematically sound, trainable in practice, and structurally aligned with the machine they will run on.

The role

You will own the model architecture roadmap at Kog. You will work on the boundary between research judgment, training reality, and execution constraints. You will decide how the next model generation is structured, trained, evaluated, and refined.

This is a hands-on leadership role. You will design architectures, write and review training code, shape experiments, make calls on model direction, and lead a team toward work that compounds. You will be expected to move with equal rigor between theory, implementation, and measured outcomes.

What you will work on

  • Architecture design for new model generations, including routing strategies, attention mechanisms, MoE structure, and architectural choices that improve capability while staying aligned with execution reality

  • Research directions that extend the Laneformer thesis and deepen the overlap between model design and engine design

  • Training pipeline ownership across convergence stability, distributed training efficiency, data pipeline design, post-training optimization, and evaluation methodology

  • Close collaboration with the GPU and systems team so that hardware constraints shape model choices early, and model choices create real execution advantages

  • Experiment strategy that favors fast, rigorous learning loops and turns architectural ideas into measurable decisions

  • Technical direction for a small team working on the critical path of model capability and generation speed

Must-have

  • You have trained large models or comparably demanding architectures from scratch and understand training dynamics at the level where you can diagnose issues from first principles

  • You have made architectural decisions that produced measurable improvements and can explain why they worked

  • You have strong fluency in Transformers, MoE, and at least one alternative architecture family with enough depth to reason across tradeoffs rather than within one paradigm

  • You can write production-grade research code in PyTorch, JAX, or an equivalent stack and bridge research ideas with robust implementation

  • You have operated with real ownership over difficult technical work and raised the standard of the people around you through research, judgment, code, and decision-making

  • You are comfortable carrying both individual technical depth and team-level responsibility in the same role

Strong signal

  • You have worked on architecture-hardware co-design, inference-aware training, or model decisions shaped directly by execution constraints

  • You have experience with post-training optimization such as speculative decoding, quantization-aware training, preference optimization, or related methods that materially affect deployment behavior

  • You have built systems where training, evaluation, and serving considerations had to be thought about together rather than in sequence

  • You have a public trace of serious research work with implementation depth, such as papers, repositories, benchmark results, technical writing, or open-source contributions adopted by others

Top 0.1% for this role

The strongest candidates for this role have already developed original architectural judgment. They have designed model structures that improved performance because of a specific decision they made, and they can explain the mechanism clearly. They understand that model design is constrained by training dynamics, communication patterns, memory behavior, and the realities of the execution path.

When they encounter an idea like Laneformer, they do not treat it as an isolated modeling trick. They immediately understand the downstream consequences for routing, layer structure, optimization, convergence, and generation speed. They know how to turn that understanding into a research program, a training plan, and a sequence of experiments that compounds.

They bring both authorship and taste. They know when a result is fundamental, when it is local, and when a model change is worth the system cost it introduces.

What we offer

  • Direct access to AMD MI300X clusters from day one, with enough compute to validate serious work at real scale

  • A team where technical judgment carries weight and where the people closest to the problem shape the key decisions

  • Problems that sit on the critical path of model execution speed and that directly influence what the system can become

  • A remote-first working model, with working hours that regularly overlap with France time and monthly Paris weeks for engineering depth, alignment, and time together

  • Compensation aligned with top technical profiles in the Paris AI market, including meaningful equity

Skills Required

  • Trained large models or demanding architectures from scratch
  • Made architectural decisions leading to measurable improvements
  • Strong fluency in Transformers, MoE, and alternative architecture families
  • Write production-grade research code in PyTorch, JAX, or similar
  • Experience in ownership over complex technical work
  • Comfortable with technical depth and team-level responsibility