Senior Python Engineer, DataHub Ingestion Framework

Reposted 13 Days Ago
Palo Alto, CA
In-Office
225K-300K
Senior level
Database

DataHub is an AI & Data Context Platform adopted by over 3,000 enterprises, including Apple, CVS Health, Netflix, and Visa. Developed jointly with a thriving open-source community of 13,000+ members, DataHub's metadata graph provides in-depth context on AI and data assets with best-in-class scalability and extensibility.

The company's enterprise SaaS offering, DataHub Cloud, delivers a fully managed solution with AI-powered discovery, observability, and governance capabilities. Organizations rely on DataHub solutions to accelerate time-to-value from their data investments, ensure AI system reliability, and implement unified governance, enabling AI & data to work together and bring order to data chaos.

The Challenge

As AI and data products become business-critical, enterprises face a metadata crisis:

  • No unified way to track the complex data supply chain feeding AI systems
  • Engineering teams struggling with data discovery, lineage, and governance
  • Organizations needing machine-scale metadata management, not just human-browsable catalogs

Why This Matters

This is where infrastructure meets impact. The metadata layer you'll build will directly power the next generation of AI systems at massive scale. Your code will determine how safely and effectively thousands of organizations deploy AI, affecting millions of users worldwide.

The Role

We're looking for an exceptional Python engineer to lead development of DataHub's ingestion framework – the core that connects diverse data systems and powers our metadata collection capabilities.

You'll Build
  • Scalable, fault-tolerant ingestion systems for enterprise-scale metadata
  • Clean, intuitive APIs for our connector ecosystem
  • Event-driven architectures for real-time metadata processing
  • Schema mapping between diverse systems and DataHub's unified model
  • Versioning systems for AI assets (training data, model weights, embeddings)

You Have
  • 4+ years building production-grade distributed systems
  • Advanced Python expertise with a focus on API design
  • Experience with high-scale data processing or integration frameworks
  • Strong systems knowledge and distributed architecture experience
  • A track record of solving complex technical challenges

Bonus Points
  • Experience with DataHub or similar metadata/ETL frameworks (Airflow, Airbyte, dbt)
  • Open-source contributions
  • Early-stage startup experience

Location and Compensation

Bay Area (hybrid, 3 days in Palo Alto office)

Salary Range: $225,000 to $300,000

Benefits and Perks

We invest in people so they can do their best work and enjoy doing it. Our benefits reflect the way we build: practical, thoughtful, and designed to support long-term growth.

Competitive compensation

We offer salaries that reflect your skills, experience, and the impact you make. You bring value—we make sure you're recognized for it.

Equity for everyone

Every team member receives an ownership stake in the company. When we grow, you grow with us.

Remote Work

All roles are remote unless otherwise specified in the job description. Review the job description to confirm if the role you are interested in is remote or hybrid.

Location flexibility

Home office, coworking space, or something in between? We support your ideal setup. You’ll receive a monthly coworking stipend to use whenever you need a change of pace or in-person collaboration time.

Comprehensive health coverage

Your well-being matters. We cover 99% of medical, dental, and vision premiums for employees, and 65% for dependents.

Flexible savings accounts

We offer FSAs to help cover planned or unexpected healthcare costs. You can also opt into a Dependent Care FSA to support family needs.

Support for every path to parenthood

Through Carrot Fertility, we provide inclusive fertility benefits and family-forming support. All U.S. employees have access, regardless of age, gender identity, or family structure.

Time off that works for you

We trust you to take the time you need. Our unlimited PTO and sick leave policy is designed for flexibility, rest, and real life.


Why Join Us

DataHub is at a rare inflection point: we’ve achieved product-market fit, earned the trust of leading enterprises, and secured backing from top-tier investors like Bessemer Venture Partners and 8VC. The context platform market is expected to grow from $1B to $9B in the next five years—and we’re leading the way.

By joining our team, you’ll:

  • Tackle high-impact challenges at the heart of enterprise AI infrastructure
  • Ship production systems that power real-world use cases at global scale
  • Collaborate with a high-caliber team of builders who’ve scaled some of the most influential data tools in the world
  • Build the next generation of AI-native data systems, including conversational agents, intelligent classification, automated governance, and more

If you're passionate about technology, enjoy working with customers, and want to be part of a fast-growing company changing the industry, we want to hear from you!

Top Skills

Airbyte
Airflow
API Design
dbt
Distributed Systems
ETL Frameworks
Metadata Management
Python

The Company
HQ: Mountain View, CA
33 Employees
Year Founded: 2021

What We Do

Founded by the leaders who built data teams at LinkedIn and Airbnb, Acryl Data enables you to take back control of your fragmented data stack. We do this by driving DataHub, the #1 open-source metadata platform, which has a community of 8,000+ data practitioners and is deployed in 1,000+ companies.

Acryl DataHub is a third-generation streaming metadata platform that integrates with 50+ tools in the data stack (dbt, Kafka, Snowflake, Airflow, Looker, etc.) to enable data discovery, data lineage, data governance, and data observability.

✅ Connect to your data sources within minutes, and gain end-to-end visibility.
✅ Power mission-critical workflows with a SOC-2-compliant platform.
✅ Bring data and business teams together with a single source of truth to create governed data products.

Powering data teams at Notion, Zendesk, Riskified, and many more!
