Data Engineer

Posted 6 Days Ago
Prague, CZE
Hybrid
Mid level
Web3 • Automation
The Role
As a Data Engineer, you will manage integration between Snowflake and various operational tools, ensuring data accuracy and availability for Sales, Marketing, and Product teams. Responsibilities include building reliable data pipelines, designing CDP layers, and resolving pipeline incidents.

Apify is the largest marketplace of tools for AI. Its 30,000+ Actors help people and agents get real-time web data, track competitors, generate leads, or integrate their apps. Actors are built by a global creator community that now earns more than $1M every month.

Join us to help people put the web to work. Apify can find missing children, protect consumers from fake discounts across the EU, and feed data to AI chatbots.

We're looking for a Data Engineer to own the integration layer between Snowflake and the operational tools that run Apify's go-to-market and product motion: HubSpot, Intercom, Mixpanel, and Segment. You'll make sure the right data lands in the right system at the right time, with the right shape, so Sales, Marketing, Customer Success, and Product teams can act on it.

You'll be the 9th member of the data team - joining a mix of analytics engineers, analysts, and data scientists - just as Segment is being rolled out as Apify's CDP. That's yours to land end-to-end.

What you'll be working on:
  • Own the integration domain end to end - all pipelines, transformations, and Snowflake models that connect HubSpot, Intercom, Mixpanel, and Segment to the rest of the platform, in both directions.

  • Design event tracking and the CDP layer with the RevOps team as Segment becomes the source of truth for behavioral data flowing into product, marketing, and CRM systems.

  • Build reliable, observable pipelines in Keboola and dbt - with clear data contracts, schema tests, freshness guarantees, and alerting.

  • Model integration data in Snowflake so HubSpot, Intercom, Mixpanel, and Segment data lands in well-defined tables that downstream consumers can trust, with documentation that analysts and scientists can actually use.

  • Power lifecycle automations - PQA scores synced back into HubSpot, behavioral campaigns in Intercom and Customer.io, product usage signals - by shipping the data they depend on.

  • Diagnose and resolve pipeline incidents independently - trace lineage across multiple components, find root causes, fix, and write the runbook so it doesn't bite the next person.

Tech stack:
  • Snowflake - data warehouse

  • Keboola - extractors, writers, and orchestration

  • dbt - transformations on Snowflake (orchestrated by Keboola; this is where we're actively migrating existing transformation logic)

  • Tableau and Redash - BI

  • n8n - workflow automation

  • Segment - CDP, currently being rolled out end-to-end

Who we're looking for:
  • 3+ years of data engineering experience, with meaningful time spent on integrations between a cloud warehouse and operational SaaS tools (HubSpot, Salesforce, Intercom, Zendesk, Mixpanel, Amplitude, Segment, RudderStack, or similar).

  • Fluent in SQL (window functions, CTEs, complex multi-source joins, query optimization) and comfortable in Python for the parts a no-code tool can't handle.

  • Production experience with Snowflake (or BigQuery, Databricks, Redshift), and an understanding of the cost, performance, and access-control tradeoffs of a usage-based warehouse.

  • Experience building end-to-end pipelines combining an orchestration or ELT platform (Keboola, Fivetran, Airflow, Dagster, Prefect, Matillion) with a transformation framework like dbt.

  • Hands-on experience with a CDP (Segment, RudderStack, mParticle) - tracking plans, schemas, identity resolution, downstream consumers - not just installing the snippet.

  • You think in data contracts - schema stability, freshness SLAs, documented field definitions - and treat the boundary between your domain and downstream consumers as a first-class interface.

  • Comfortable with reverse ETL (Census, Keboola, or hand-rolled), and you understand what it means to write back to a CRM that humans are also editing.

  • Pragmatic about tooling - happy to use n8n for the right job, and equally happy to write proper code when that's the right call.

  • Able to explain to non-technical stakeholders in Sales, Marketing, and Customer Success why a number on a dashboard moved and what it means, in English, both in writing and in person.

Nice to have:
  • Experience with usage-based billing or product-led growth data models.

  • Exposure to LLM-assisted workflows in the data stack.

  • Prior experience at a SaaS company between 50 and 500 people.

By the end of the first month, we expect you to:
  • Know the data team, the RevOps and Growth stakeholders who depend on the integration layer, and the workflows that flow through HubSpot, Intercom, Mixpanel, and Segment.

  • Work through the existing Keboola components and dbt models to understand what's in place, what's fragile, and where the silent failures live.

  • Trace a typical record from each source system through to the Snowflake tables analysts use.

By the end of the first 3 months, we expect you to:
  • Have a complete map of the integration domain - what flows where, what's owned by whom, where the silent failures are - and a documented six-month plan for the work ahead.

  • Have at least one end-to-end improvement shipped with monitoring in place.

  • Be the go-to person on the data team for HubSpot, Intercom, Mixpanel, and Segment data questions.

By the end of the first 6 months, we expect you to:
  • Have Segment operating as the durable CDP for Apify, with a published tracking plan and reliable event flows into Snowflake and downstream tools.

  • Have core tables from HubSpot, Intercom, Mixpanel, and Segment with documented data contracts - schema, freshness SLA, ownership - and tests and alerting in place.

  • Have driven measurable improvements in data freshness, pipeline reliability, and incident response time, tracked publicly, and shipped at least one cross-team initiative where the data integration unlocked a business outcome (conversion lift, churn reduction, ops automation).

Why should you work at Apify?
  • Space, support, and autonomy for personal growth, with a direct impact on our success

  • Full-time position in Prague (Lucerna Palace)

  • Flexible working hours (perfect for both night owls 🦉 and early birds 🐥)

  • Nobody counts holidays as long as the work gets done 💪

  • Unlimited Claude for every Apifier. We don't count tokens. Just use them well 🤖

  • Stock options and profit sharing 💰

  • Free Multisport card

  • We welcome pets, kids, and bikes in the office

  • Epic team buildings and offsites 🚢 with biking, canoeing, and other adventures 🪂

  • Solid education and training budget, conference tickets, internal “Eat & Learn” sessions, and the possibility to work across teams

  • Generous hardware budget 💻

  • Free lunches every day when working from the office 🌮🥡

  • Unlimited supply of ☕ & 🍺 and snacks

  • Free entry to the wonderful Prague and Brno Zoos 🐘

  • Ping-pong, chess, PS5, lightsabers, and a foosball league after lunch

For more details about Apify and what it’s like to work with us, see our Careers page.

Skills Required

  • 3+ years of data engineering experience with integrations
  • Fluent in SQL and comfortable in Python
  • Production experience with Snowflake or similar data warehouses
  • Experience building end-to-end data pipelines
  • Hands-on experience with a Customer Data Platform
  • Ability to explain data implications to non-technical stakeholders

The Company
HQ: Prague
97 Employees
Year Founded: 2015

What We Do

Apify is a full-stack web scraping and browser automation platform that lets you extract data from websites and automate workflows on the web. With Apify, you can turn any website into an API!
