Head of Data Platform

Hiring Remotely in Denver, CO, USA
In-Office or Remote
160K-195K Annually
Senior level
Automotive • Information Technology • Software
Revolutionizing automotive retail with our Dealer Experience Platform™ and a groundbreaking partnership with Amazon.
The Role
Lead design and ownership of a multi-tenant data platform and feature store to enable closed-loop ML for dealer optimization. Build pipelines, model data flows, action ledger, governance, migration strategy, and automated data quality/monitoring. Hands-on in SQL/code, architect data schemas, ensure tenant isolation, and operationalize training and serving for SageMaker and Bedrock-powered optimization functions.
Summary Generated by Built In
Why This Role Exists

We operate a multi-tenant automotive SaaS platform serving thousands of dealer groups across the United States. Our data layer — MySQL, Aurora, DynamoDB with DynamoDB Streams, S3, Glue Data Catalog — has grown to support a complex, high-throughput transactional platform. That layer works. Now we need to make it intelligent.

We are building a Dealer Intelligence Platform: a closed-loop system that observes raw signals from dealer operations, predicts outcomes, optimizes decisions under constraints, acts through approved channels, and learns from what happened. Pricing optimization, lead routing, inventory mix planning, service bay scheduling — each is a self-contained optimization function that consumes features, scores predictions, and closes the loop with action telemetry.

This role owns the entire data substrate that makes that loop possible — the lake, the feature store, the model registry, the action ledger, and the governance framework that keeps it all tenant-isolated and audit-grade. You are not inheriting a finished architecture; you are designing the one that turns a transactional platform into a decision engine.

Scope & Scale
  • Targeting 5,000+ dealer tenants, each with isolated databases and per-tenant configuration.
  • Billions in annual Gross Merchandise Value (GMV) flowing through platform transactions.
  • 30+ third-party integrations across DMS, CRM, lending, F&I, and marketplace providers — each pushing data in different formats (SOAP/XML, REST, JSON, email).
  • Data pipelines spanning 6 integration domains with multi-protocol vendor connectivity.
  • Data Streams processing real-time change events across onboarding, inventory, and transaction tables.
  • A product roadmap with 6+ optimization functions, each requiring its own entity model, feature set, constraint definition, and feedback path.
What You Will Own
  • The data platform end to end: Bronze (raw + telemetry), Silver (canonical entities via Standardization Agent), Gold (KPIs + derived features) — plus the Feature Store and Action Ledger that make the optimization loop possible.
  • Feature Store architecture — online (sub-50ms reads for real-time scoring) and offline (point-in-time joins for training). Feature contract: owner, freshness SLO, PII tag, training/inference parity. Governed by Lake Formation.
  • Action Ledger — every recommendation, approval, override, and outcome logged as a first-class object. This is the substrate that closes the loop: without it, models cannot retrain, we cannot attribute lift, and we cannot prove value to dealers.
  • Model data pipeline — the feature materialization, training data assembly, and serving infrastructure that feeds SageMaker models and Bedrock agents across all optimization functions.
  • Data migration strategy for the legacy platform — defining which tables move to DynamoDB, which consolidate into Aurora Serverless, and how dual-write validation works at every stage.
  • Data quality and anomaly detection — automated monitoring for schema drift, null-rate spikes, stale pipelines, and integration data inconsistencies. If a bad feature reaches a model, that is your bug.
  • Data governance: retention policies, PII handling, audit trail integrity, and multi-tenant AI governance (per-dealer data isolation for model training, cost attribution for Bedrock inference, FTC-grade action audit).
  • Streaming and event-driven data flows: DynamoDB Streams, EventBridge, CDC patterns, and real-time feature materialization.
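
The offline Feature Store item above calls for point-in-time joins when assembling training data. As a rough illustration only (not the company's implementation), the sketch below attaches to each labeled event the most recent feature value recorded at or before that event's timestamp, so no future information leaks into training. The function name `point_in_time_join`, the tuple layouts, and the sample VIN-style keys are all hypothetical.

```python
from bisect import bisect_right


def point_in_time_join(labels, feature_history):
    """Attach to each label the latest feature value known at event time.

    labels:          iterable of (entity_id, event_time, label)
    feature_history: iterable of (entity_id, feature_time, value)
    Both layouts are assumptions made for this sketch.
    """
    # Group feature rows per entity, ordered by timestamp.
    by_entity = {}
    for entity_id, ts, value in sorted(feature_history, key=lambda r: (r[0], r[1])):
        times, values = by_entity.setdefault(entity_id, ([], []))
        times.append(ts)
        values.append(value)

    joined = []
    for entity_id, event_time, label in labels:
        times, values = by_entity.get(entity_id, ([], []))
        # Index of the last feature row with feature_time <= event_time.
        idx = bisect_right(times, event_time) - 1
        feature = values[idx] if idx >= 0 else None  # None: no feature existed yet
        joined.append((entity_id, event_time, feature, label))
    return joined


if __name__ == "__main__":
    history = [("vin1", 1, 100), ("vin1", 5, 90), ("vin2", 3, 50)]
    events = [("vin1", 4, True), ("vin1", 5, False), ("vin2", 2, True)]
    for row in point_in_time_join(events, history):
        print(row)
```

In a multi-tenant setting the entity key would presumably also carry a tenant identifier so that features never join across dealers; that detail is left out here for brevity.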
Technical Environment
  • Data Stores: DynamoDB (single-table design, DynamoDB Streams), MySQL/Aurora, S3 data lake (Bronze/Silver/Gold layers, Iceberg), DAX, Glue Data Catalog.
  • Feature Store: SageMaker Feature Store (online + offline), S3 + Iceberg for point-in-time training data, Lake Formation-governed feature access.
  • Streaming & Events: DynamoDB Streams, EventBridge, CDC, real-time feature materialization pipelines.
  • ML/AI Platform: SageMaker (training pipelines, model registry, Model Monitor), AWS Bedrock (Claude, Titan), LangChain/LangGraph, Bedrock Knowledge Bases.
  • Pipelines: Apache NiFi (multi-provider normalization), AWS Glue (ETL, Data Catalog), Athena (ad-hoc analytics).
  • Infrastructure: CloudFormation, Lake Formation, OpenTelemetry, CircleCI.
  • Analytics: Glue Data Catalog, Athena, QuickSight or equivalent BI layer.
Hands-On Expectations

This is not a slides-and-meetings role. We expect roughly 30–40% of your time in code and SQL (writing critical-path feature pipelines, data migrations, schema designs, and reviewing PRs with rigor), 40–50% in design (data architecture docs, entity modeling sessions, feature catalog design, optimization-function data contracts), and 10–20% on cross-team alignment, mentoring, and operating mechanisms.

First 12 Months
  • Months 1–3: Immerse in the data landscape — audit every data store, pipeline, integration flow, and DynamoDB Stream. Map the current entity model against the v2 optimization-function requirements. Publish the first data architecture ADR. Identify the top data quality gaps and define the canonical entity schemas that Silver and Gold will standardize around.
  • Months 4–6: Stand up the Feature Store (online + offline) and migrate the first round of features from Gold derivations. Build the Action Ledger schema and approval-gateway integration. Deliver the first data migration wave with dual-write validation. Ship automated data quality monitoring that catches drift before it reaches a model.
  • Months 7–9: Pricing Optimizer data pipeline live in production — features flowing, predictions scoring, action telemetry closing the loop. Feature catalog operational with freshness SLOs and PII tagging. Per-dealer + global model training pipelines running. Lead Routing data contracts defined and in shadow mode.
  • Months 10–12: Second optimization function (Lead Routing or Inventory Mix) in production. Drift detection and auto-rollback wired into Model Monitor. Outcome dashboards proving incremental lift to dealers. Data platform roadmap documented for the next 12 months. The team treats feature quality, action telemetry, and training/inference parity as non-negotiable engineering standards because of the patterns you set.

Requirements

You Should Have
  • 7+ years in data engineering or data architecture with at least 2 years in a platform-level architect or Head-of role.
  • Deep experience with both relational (MySQL/PostgreSQL/Aurora) and NoSQL (DynamoDB, DynamoDB Streams) data modeling — and strong opinions on when each is appropriate.
  • Hands-on experience building or operating feature stores — online serving, offline training, point-in-time correctness, feature freshness SLOs.
  • Hands-on experience with AWS data and ML services: Glue, Athena, S3, DynamoDB, DynamoDB Streams, Aurora, Lake Formation, SageMaker.
  • Current, practicing AI/Gen-AI practitioner — you have built or operated systems that prepare data for LLMs, embeddings, or ML models in production. This is not theoretical interest; it is hands-on recent experience.
  • Experience with streaming and CDC patterns: DynamoDB Streams, Kinesis, EventBridge, or Kafka for real-time data propagation and feature materialization.
  • Experience with data pipeline orchestration: NiFi, Airflow, Step Functions, or equivalent.
  • Understanding of data migration patterns: dual-write, change data capture (CDC), reconciliation validation, zero-downtime cutover.
  • Experience with multi-tenant data architectures — database-per-tenant, schema-per-tenant, Row-Level Security, and the judgment to know which trade-offs matter at 5,000+ tenants.
  • Strong data governance instincts: retention policies, PII handling, audit trails, cost attribution, and the discipline to enforce them before the first model ships.
  • The ability to write a data architecture doc that engineers can implement without ambiguity, and the judgment to know when to write the SQL or pipeline code yourself instead.
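
The migration item above lists dual-write and reconciliation validation. One simple way to picture reconciliation is to compare rows from the legacy store and the new store by primary key and content digest. The sketch below assumes rows are available as plain dictionaries with an `id` key; the function names `row_digest` and `reconcile` are hypothetical and do not come from any AWS SDK.

```python
import hashlib
import json


def row_digest(row):
    """Stable digest of a row; canonical JSON makes key order irrelevant."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reconcile(legacy_rows, new_rows, key="id"):
    """Compare two row sets written by a dual-write path.

    Returns the keys missing from the new store, keys present only in
    the new store, and keys whose contents differ between the stores.
    """
    legacy = {r[key]: row_digest(r) for r in legacy_rows}
    new = {r[key]: row_digest(r) for r in new_rows}
    return {
        "missing_in_new": sorted(legacy.keys() - new.keys()),
        "unexpected_in_new": sorted(new.keys() - legacy.keys()),
        "mismatched": sorted(
            k for k in legacy.keys() & new.keys() if legacy[k] != new[k]
        ),
    }


if __name__ == "__main__":
    legacy = [{"id": 1, "price": 100}, {"id": 2, "price": 200}, {"id": 3, "price": 300}]
    new = [{"id": 1, "price": 100}, {"id": 2, "price": 250}, {"id": 4, "price": 400}]
    print(reconcile(legacy, new))
```

A report with all three lists empty is the kind of signal a cutover gate could require before a migration wave proceeds; at real table sizes the comparison would likely run per partition or per tenant rather than in memory.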
Strongly Preferred: ML & Optimization Data Infrastructure
  • Hands-on experience with SageMaker — training pipelines, model registry, Model Monitor, and production model lifecycle (shadow → canary → drift watch → auto-rollback).
  • Hands-on experience with AWS Bedrock — Knowledge Bases, model invocation, guardrails, and production deployment of foundation models.
  • Experience building embeddings pipelines — vectorizing structured and unstructured data for semantic search, recommendations, or retrieval-augmented generation.
  • Experience designing retrieval systems — chunking strategies, metadata filtering, re-ranking, and evaluating retrieval quality.
  • Experience building closed-loop data systems — action telemetry, outcome attribution, A/B holdout management, and lift measurement.
  • Experience with data quality and anomaly detection at scale — automated monitoring for schema drift, null rates, freshness SLAs, and feature/training skew.
  • Understanding of multi-tenant AI governance: PII redaction, tenant-scoped inference, per-dealer model routing, cost attribution, and audit logging.
Nice to Have
  • Experience with automotive, fintech, or multi-tenant marketplace data, including compliance and retention requirements.
  • Familiarity with data formats and protocols from third-party providers (CDK, DealerTrack, Tekion).
  • Experience with constrained optimization, assignment problems, or scheduling solvers at the data layer.
  • Background with event-driven and streaming data architectures (EventBridge, DynamoDB Streams, Kafka, CDC streams).
  • Experience with vector databases (OpenSearch, Pinecone) for production AI workloads.
  • Experience with Iceberg, Delta Lake, or other open table formats for lakehouse architectures.

Benefits

About A2Z Sync

A2Z Sync is a fast-paced and innovative automotive SaaS company seeking to make life better for our customers. We offer you a fun, casual, and collaborative culture, while fostering an environment where you work hard, see your results, and feel your impact. We are committed to our employees, and this starts with providing benefits that allow you to care for you and your family.

Mission

At A2Z Sync, we replace the friction of disconnected systems with the velocity of a single platform. We integrate digital insights with in-store operations to deliver transparent transactions that bring clarity to the car buyer and increased profitability to the dealer.

Our Values: We Are DRIVEN
  • Dealership Obsessed: We measure our success by the dealer's wins and the trust of their buyers, not just our own code.
  • Relentless Ownership: No lone wolves, but no pass-backs either. We don't say "that's not my job."
  • Invent with Purpose: We don't chase "shiny" tech. We replace guesswork with intelligence, building the "data backbone" that turns raw information into a competitive advantage.
  • Value Every Perspective: We are Better Together. We check egos at the door.
  • Evolve or Evaporate: Change is our constant. We stay ahead by learning faster than the competition.
  • Now Over Next: Perfection is the enemy of progress. We prefer action over endless analysis.
Here’s how we are doing it:
  • A2Z Sync offers comprehensive medical, dental, and vision benefits.
  • Employer-provided STD/LTD and life insurance.
  • Matching 401k plan.
  • Unlimited paid time off, including 10 paid holidays.
  • Real ownership of a high-stakes AI surface — your roadmap, your architecture decisions, your metrics.
  • The expected salary range for this role is $160,000 to $195,000 annually, commensurate with experience and qualifications.

Skills Required

  • 7+ years in data engineering or data architecture with at least 2 years in a platform-level architect or Head-of role
  • Deep experience with relational (MySQL/PostgreSQL/Aurora) and NoSQL (DynamoDB, DynamoDB Streams) data modeling
  • Hands-on experience building or operating feature stores (online serving, offline training, point-in-time correctness, freshness SLOs)
  • Hands-on experience with AWS data and ML services: Glue, Athena, S3, DynamoDB, DynamoDB Streams, Aurora, Lake Formation, SageMaker
  • Current, practicing AI/Gen-AI practitioner with hands-on production experience preparing data for LLMs, embeddings, or ML models
  • Experience with streaming and CDC patterns (DynamoDB Streams, Kinesis, EventBridge, or Kafka)
  • Experience with data pipeline orchestration (NiFi, Airflow, Step Functions, or equivalent)
  • Understanding of data migration patterns: dual-write, CDC, reconciliation validation, zero-downtime cutover
  • Experience with multi-tenant data architectures (database-per-tenant, schema-per-tenant, Row-Level Security) at scale
  • Strong data governance instincts: retention policies, PII handling, audit trails, and cost attribution
  • Ability to write clear, implementable data architecture documents and willingness to write SQL/pipeline code as needed
  • Hands-on experience with SageMaker training pipelines, model registry, Model Monitor, and production model lifecycle
  • Hands-on experience with AWS Bedrock, Knowledge Bases, model invocation, and guardrails
  • Experience building embeddings pipelines and vectorization for semantic search or RAG
  • Experience designing retrieval systems: chunking, metadata filtering, re-ranking, evaluation
  • Experience building closed-loop data systems: action telemetry, outcome attribution, A/B holdout management, lift measurement
  • Experience with data quality and anomaly detection at scale (schema drift, null rates, freshness SLAs, skew)
  • Understanding of multi-tenant AI governance: PII redaction, tenant-scoped inference, cost attribution, audit logging
  • Experience with automotive, fintech, or multi-tenant marketplace data (compliance/retention requirements)
  • Familiarity with third-party provider data formats/protocols (CDK, DealerTrack, Tekion)
  • Experience with constrained optimization, assignment problems, or scheduling solvers at the data layer
  • Experience with vector databases (OpenSearch, Pinecone) for production AI workloads
  • Experience with Iceberg, Delta Lake, or other open table formats for lakehouse architectures
The Company
HQ: Greenwood Village, CO
66 Employees
Year Founded: 2017

What We Do

A2Z Sync is transforming automotive retail with our groundbreaking Dealer Experience Platform™ (DXP™), seamlessly uniting the digital and in-store car-buying experience. Unlike fragmented solutions, our platform was designed with the in-store experience at its core, ensuring that pricing, payments, and deal structures remain consistent across both online and showroom interactions. This creates a smooth, transparent handoff for customers and a faster, more profitable process for dealerships. Our all-in-one platform integrates every element of a car deal—from pricing inventory to finalizing transactions—empowering operators and sales leaders with real-time visibility through powerful tools like the manager dashboard. With A2Z Sync, dealerships can deliver a streamlined, customer-centric buying experience that drives satisfaction and profitability. In a revolutionary step forward, A2Z Sync has partnered with Amazon to redefine the future of automotive retail. Together, we’re setting a new standard for innovation, efficiency, and customer experience in the industry.

Why Work With Us

At A2Z Sync, we’ve built a fun, casual, and collaborative workplace where your hard work drives real results and your impact is felt daily. We prioritize employee well-being with employer-paid health benefits, flexible hybrid schedules, unlimited PTO, 401(k) matching, mental health support, and even pet insurance. Join us to make a difference!
