Founding Applied Research Engineer

Posted 4 Days Ago
San Francisco, CA, USA
In-Office
Mid level
Information Technology • Software
The Role
Design and run research programs on agent systems, signal classification, cost-efficient inference, and behavioral benchmarking in applied AI. Build evaluation frameworks, run experiments, turn findings into production systems, and help define the applied AI research agenda.
Summary Generated by Built In
Why This Role Exists

Foundation models are commoditizing. Defensibility comes from specialized models, proprietary training signals, and evaluation ownership. Every applied AI company we benchmark against, including Decagon, Harvey, Sierra, and Cursor, has already moved. The window to claim frontier applied AI for revenue will close within the next few months.

Rox is in market. We run agents against enterprise data at scale, every day. We see exactly where research meets production: the data is dirty, state is changing, and being wrong costs a lot of money.
The Applied Research team exists to close the gap between research and production permanently.

What This Team Works On

Four problems we care about right now:

Cost-efficient inference for Clever Columns. Distill a Rox-trained model from frontier teachers so per-account enrichment runs at 1/20th the cost without quality loss. Ships first. Doesn't require trajectory attribution.

Signal classification across the public knowledge graph. A small, fast classifier that distinguishes genuine buying signals from noise across the news, jobs, and filings corpus we already ingest at scale. Powers Recommended Next Moves and Auto Prospecting. Cleanest data subset.

Personalization grounding and hallucination detection. A reward model that catches fabricated prospect context in Sequences in real time. This is the most underrated production failure mode in outbound AI. Trained on cross-customer consensus edits.

Sequencing policy under sparse, delayed rewards. Offline-to-online RL on multi-touch trajectories with intermediate signals as proxies for terminal outcomes. Long-horizon flagship. Hard. [Depends on trajectory instrumentation in progress with Platform Eng.]

These are not benchmark problems. They have real SLAs and real customers depending on them.

What You'll Do
  • Design and run research programs tied directly to the four problems above.

  • Build evaluation frameworks that measure trajectory quality, not just final output, because most eval infrastructure measures end results and we care about the path.

  • Work on agent memory, retrieval, and context systems alongside an elite, competitive engineering team.

  • Translate findings into infrastructure with measurable production impact. Help define where Rox Research goes next.

What We're Looking For

You have spent real time thinking about how agents fail in practice, not just on benchmarks. You have built evaluation systems and know exactly where standard approaches break down. You can write code well enough to implement your own ideas, run your own experiments, and ship things that make it into production.

You move fast. The environment changes monthly and the team ships continuously.

Particularly relevant: agent evaluation and behavioral benchmarking; retrieval-augmented generation and knowledge graph systems; RL applied to real-world agent behavior; production ML systems (latency, reliability, observability); post-training and model adaptation for production use cases.

A PhD is not required. Strong research instincts and the ability to ship are.

What Success Looks Like

First few weeks: you understand Rox's architecture, where the production problems are, and where the research gaps are. You have opinions and you share them.

First few months: you are running experiments that directly inform how we build. Something you worked on is in production.

Over time: you are defining the research agenda for the most interesting applied AI problem in the enterprise. The systems you build are things no one else has built before, because no one else has the structural data position to build them.

Why Join Now

We are at an unusual moment. Large enough to have real scale, real customers, and genuinely interesting research problems. Small enough that you are one of a handful of people shaping what the Applied Research function looks like and what it prioritizes.

The team is extraordinary: IMO, IOI, and ICPC medalists, researchers from DeepMind and OpenAI. The feedback loop is a live enterprise system, not a leaderboard. If that's not more interesting to you than publishing for the sake of publishing, this probably isn't the right fit.

San Francisco, onsite. We relocate exceptional people.

Skills Required

  • Experience building evaluation systems for AI models
  • Knowledge of retrieval-augmented generation and knowledge graph systems
  • Experience applying reinforcement learning to real-world behaviors
  • Ability to write code and run experiments
  • Familiarity with production ML systems, including latency and reliability
  • Strong research instincts and innovative thinking

The Company
HQ: San Francisco, CA
67 Employees
Year Founded: 2024

What We Do

Rox helps businesses secure and grow revenue.
