Delve Deeper

AI Engineer Lead

Posted 4 Days Ago

Warsaw, Warszawa, Mazowieckie, POL

Hybrid

250K-250K Annually

Senior level

AdTech • Agency • Digital Media • Marketing Tech • Social Media • Analytics • Big Data Analytics

Delve Deeper is a performance media agency that helps clients grow their customer base through technology.

The Role

The AI Engineer Lead will lead the development of the Start Stop Scale Agent, focusing on architecture, data engineering, evaluation frameworks, and team leadership to ensure production quality and operational reliability.

Summary Generated by Built In

WHO WE ARE

DELVE Deeper is a global performance media agency where data, technology, and marketing intersect.

We help brands like UNICEF, Virgin Voyages, and Orange grow by using data, analytics, and automation to drive measurable results. Our teams work at the intersection of media, data science, and technology in a fast-paced, international environment.

ROLE OVERVIEW

The mandate is to take the Start Stop Scale (SSS) Agent from design through stable production across a multi-agent architecture, a data-heavy operating model, and a strict evaluation-first delivery process. The successful candidate will lead hands-on across LLM systems, data pipelines, connector infrastructure, context management, observability, and production quality.

The role is intentionally shaped for an AI engineering lead with a strong data engineering background. This person owns the data platform decisions that make the agent usable in production, including ETL orchestration, data contracts, vector storage, API reliability, and context formatting for LLM consumption.

WHAT THIS ROLE WILL BUILD

The immediate focus is the SSS Agent, a workflow system that combines automated weekly optimization reporting with ad hoc analysis via Claude projects, Skills, and MCP connectors. The build scope includes the core agent stack, human-in-the-loop approval flows, and the data layer that supports daily and weekly decisioning.

A production-grade multi-agent system spanning planning, scale/stop, trendspotting, start, ad hoc analysis, report assembly, taxonomy verification, feedback verification, and daily callouts
A data platform that supports 17 discrete SSS data components with clear ownership, freshness controls, and fit-for-purpose formatting for agent use
A reusable MCP connector and tool layer for Semrush, SerpAPI, Slack, and media platform APIs
A strict evaluation layer that defines success before build, measures quality during development, and monitors drift in production
A feedback loop that captures trader approvals, rejections, rationale, and operational signals back into the system

KEY OUTCOMES AND DELIVERABLES

Evaluation frameworks and success criteria defined before feature development starts
All core SSS sub-agents shipped to stable production with clear input, output, and failure-mode documentation
17 data components designed, normalized, and governed through explicit data contracts
Automated ETL or ELT pipelines supporting daily callouts and weekly optimization reporting
Production-ready connector library with resilient retry logic, rate-limit handling, and fallback behavior
Vector database or retrieval layer supporting long-term memory, similarity retrieval, and context assembly
Monitoring, alerting, and quality regression checks across model outputs, connectors, and pipelines
Slack-based human-in-the-loop workflows for approvals, feedback capture, taxonomy exceptions, and operational callouts

CORE RESPONSIBILITIES

1. Agent systems and platform architecture

Design and implement the end-to-end architecture for the SSS Agent, with reusable patterns for orchestration, tool use, memory, and failure handling.
Own context window strategy across the agent system, including chunking, retrieval, summarization boundaries, and contamination prevention.
Build platform components that allow rapid iteration without compromising production standards, including internal libraries, shared services, and testing utilities.
Translate product intent into technical architecture that preserves the integrity of the SSS decision logic.

2. Data engineering and information architecture

Own the design of the data layer that powers agent decisions, including ingestion, normalization, storage, schema discipline, and data freshness.
Build and maintain automated ETL or ELT pipelines pulling data from media platforms and internal sources on the cadence required by the product.
Define and enforce data contracts between upstream systems and the agent layer so schema changes are managed deliberately, not discovered at runtime.
Own vector storage, retrieval indexing, and context formatting so long-history performance data remains usable as volume grows.
Design rate-limit controls, circuit breakers, retries, and fallback strategies so data gaps do not silently degrade agent quality.

3. Evaluation, experimentation, and quality control

Define evaluation frameworks, success thresholds, and regression suites before any feature enters development.
Run systematic prompt testing and AI-versus-human benchmarking to ensure agent outputs meet the quality bar required for live use.
Establish confidence scoring, exception handling, and escalation logic for uncertain recommendations.
Monitor production outputs for drift, quality regressions, hallucination patterns, and connector or data degradation.
Set and enforce a clear production-readiness bar covering code quality, test coverage, documentation, and operational safeguards.

4. Delivery leadership and team direction

Lead engineers as a hands-on technical lead: decompose work, review code, unblock execution, and maintain development velocity.
Partner tightly with the Head of AI Transformation on backlog sequencing, scope realism, technical trade-offs, and build-vs-buy recommendations.
Keep the build order aligned to dependency risk, prioritizing upstream components that unlock downstream reliability.
Mentor engineers working in an AI-first codebase while maintaining a high standard for clarity, speed, and disciplined iteration.

5. Production operations and platform reliability

Implement logging, tracing, monitoring, and alerting across model behavior, pipeline health, API usage, and user-facing failures.
Respond to production issues quickly and drive root-cause fixes rather than surface-level patches.
Set cost, rate, and usage limits deliberately so the system remains commercially viable as usage scales.

CANDIDATE PROFILE

Must-have experience

7+ years across software engineering, data engineering, ML engineering, or AI platform work, including direct ownership of production systems
2+ years leading technical delivery for complex systems, with evidence of setting standards and raising execution quality
Strong Python and SQL, plus hands-on experience building APIs, services, and robust data pipelines
Deep experience with ETL or ELT design, schema management, and data platform reliability in production
Hands-on experience building or operating LLM applications, agentic systems, tool-calling workflows, or comparable AI application layers
Experience with cloud infrastructure and containerized deployments (AWS, Azure, or GCP; Docker and ideally Kubernetes)
Strong grounding in software engineering discipline, including testing, code review, CI/CD, observability, and incident response

Strongly preferred

Experience with workflow orchestration and data tooling such as Airflow, Dagster, Prefect, dbt, Kafka, or similar platforms
Experience with vector databases, retrieval systems, similarity search, and long-context data handling
Familiarity with MCP, or equivalent integration layers that connect AI systems to enterprise tools and APIs
Experience with performance marketing, ad-tech, or media platform APIs such as Google Ads, Meta, DV360, Semrush, or SerpAPI
Experience shipping systems that mix model logic, deterministic business rules, and human approval flows
Comfort translating messy business logic into precise technical rules without flattening the nuance

Working style

Player-coach mindset: willing to own architecture, write code, review code, and stay close to production realities
Decisive on trade-offs, but disciplined enough to define done before build starts
Clear communicator with non-technical stakeholders, especially when setting constraints, risks, and sequencing
Pragmatic about delivery: values stable systems, measurable quality, and reusable components over novelty

THIS ROLE IS NOT

A workflow adoption or change-management role
A low-code automation builder role
A research-only AI scientist role
A people manager removed from hands-on system ownership
A generic prompt engineer role without responsibility for data, platform, and production quality

WHAT WE OFFER

Hybrid working model: three days in the office (Tuesday to Thursday)
A competitive salary with opportunities for growth
Private medical care at Medicover
Multisport card
Annual education budget of $250
Generous employee referral program
Catered office lunch every Tuesday
Snacks and occasional breakfasts available in the office

Please submit your CV in English!

Top Skills

Airflow

APIs

AWS

Azure

Dagster

dbt

Docker

ELT

ETL

GCP

Kafka

Kubernetes

Prefect

Python

SQL

The Company

HQ: Louisville, CO

150 Employees

Year Founded: 2011

What We Do

Delve Deeper delivers digital marketing management, first-party data science and consulting, and adtech/martech systems integration & reselling on a global scale. We connect the dots between data and technology in media by identifying our clients’ super fans, deterministically finding more of them, and converting them online in the most effective way.

Why Work With Us

We act as one highly functioning team that is powered by our professional “Fire in the Belly”, with a passion for creating exceptional value by delighting our clients and creating an engaging work environment for our team members. Our culture emphasizes professional development in an environment where everyone can have an impact.