Lead Data Engineer

Posted 5 Hours Ago
Hiring Remotely in Canada
Remote
120K-150K Annually
Mid level
Music
The Role
Lead the design and implementation of a scalable Medallion lakehouse on GCS and Databricks. Build and govern ingestion and transformation pipelines (Astro/Airflow, PySpark, Delta Live Tables), apply domain-driven data modeling, manage Unity Catalog access and lineage, collaborate with analytics and vendors, and proactively surface risks with clear documentation.
Summary Generated by Built In

Established in 2015, Create Music Group is a leading music and entertainment company. The company operates as a record label, distribution company, and entertainment network that generates over 15 billion music streams each month on DSPs. Named #2 on the Inc. 5000 list of the fastest-growing private companies in America in 2020, the company has grown exponentially by leveraging its owned IP with its media and technology platform. The company works with superstar artists, major and independent record labels, and global media brands. It operates a number of companies including Label Engine, one of the largest independent music distribution platforms in the world, with over 75,000 artists and 5,000 label clients; and Flighthouse, a digital entertainment brand focused on Gen Z, which has more than 300 million followers across social media. Create Music Group is based in Hollywood, CA and has 400 employees worldwide.


Job Summary

The Lead Data Engineer will play a central role in the buildout of CMG's next-generation data platform. This is a high-ownership role on a small, senior team, working directly with the SVP of Data & AI to design and implement a scalable lakehouse architecture on Google Cloud Storage (GCS) and Databricks, spanning bronze, silver, and gold layers. The role emphasizes domain-driven design, data contracts, and proactive communication with both internal stakeholders and external vendors.


Responsibilities

  • Lead the technical design and implementation of CMG's Medallion 2.0 lakehouse architecture — bronze ingestion, silver transformation, and gold domain layers — built on GCS and Databricks (Delta Lake), with clear data contracts at each boundary
  • Design and manage data pipelines using Astro (Airflow), PySpark, and Delta Live Tables, ensuring reliability and scalability across ingestion and transformation layers
  • Govern the lakehouse using Databricks Unity Catalog — managing access controls, data lineage, and schema enforcement across domains
  • Apply domain-driven design principles to partition and model data domains (e.g., royalty, asset, artist, distribution)
  • Collaborate with the analytics team to ensure the gold layer reflects real business needs, reducing the need for downstream workarounds
  • Coordinate with external vendors (e.g., DataArt) and internal stakeholders across DevOps, product, and analytics
  • Proactively identify architectural risks, data quality issues, and dependency blockers with proposed resolutions
  • Maintain clear, impact-first documentation and status updates for both technical and non-technical stakeholders
  • Other duties as assigned

Qualifications 

  • 4+ years of data engineering experience, with at least 1–2 years focused on data platform or lakehouse architecture
  • Hands-on experience with Databricks — including Delta Lake, PySpark, and ideally Unity Catalog
  • Experience with GCS or equivalent cloud object storage as a lakehouse foundation layer
  • Hands-on experience with domain-driven design applied to data modeling
  • Strong command of SQL and at least one transformation framework (dbt preferred)
  • Experience with medallion or lakehouse architectures (bronze/silver/gold or equivalent)
  • Familiarity with GCP-native tooling (Pub/Sub, Dataflow, or Dataplex) is a plus
  • Excellent written communication — able to write design docs non-engineers can understand and status updates executives can act on
  • Demonstrated ability to work independently in ambiguous environments
  • Track record of flagging risks early with proposed solutions

Nice to have: Experience in music/media/entertainment data; familiarity with data contracts or schema validation (Unity Catalog, Great Expectations, dbt tests); experience with external dev vendors


Pay Scale

  • $120,000 - $150,000 CAD per year
  • The final compensation within this range will be determined based on the candidate’s experience, skills, and overall fit for the role.


The Company
HQ: Los Angeles, CA
Year Founded: 2015

What We Do

We are Create Music Group, one of the fastest-growing music technology companies in the entertainment industry and one of the largest music rights holders in the world. We operate a number of different services: music distribution, music publishing, YouTube monetization, artist development, brand development, and production.
