DataHub is an AI & Data Context Platform adopted by over 3,000 enterprises, including Apple, CVS Health, Netflix, and Visa. Innovated jointly with a thriving open-source community of 13,000+ members, DataHub's metadata graph provides in-depth context of AI and data assets with best-in-class scalability and extensibility.
The company's enterprise SaaS offering, DataHub Cloud, delivers a fully managed solution with AI-powered discovery, observability, and governance capabilities. Organizations rely on DataHub solutions to accelerate time-to-value from their data investments, ensure AI system reliability, and implement unified governance, enabling AI & data to work together and bring order to data chaos.
In this role, you will:
- Enhance the Python-based ingestion framework to support ingesting usage statistics, lineage, and operational metadata from systems like Snowflake, Redshift, Kafka, & more!
- Build connectors for major systems in the modern data and ML stacks
- Enable the ingestion framework to run in a cloud native environment
Requirements:
- Minimum 4 years of engineering experience
- Expertise in Python
- Familiarity with tools in the modern data and ML ecosystem
- Knowledge of distributed systems
- Ability to design for scale and fault tolerance
We invest in people so they can do their best work and enjoy doing it. Our benefits reflect the way we build: practical, thoughtful, and designed to support long-term growth.
Competitive compensation
We offer salaries that reflect your skills, experience, and the impact you make. You bring value—we make sure you're recognized for it.
Equity for everyone
Every team member receives an ownership stake in the company. When we grow, you grow with us.
Remote work
All roles are remote unless otherwise specified in the job description. Review the job description to confirm whether the role you are interested in is remote or hybrid.
Location flexibility
Home office, coworking space, or something in between? We support your ideal setup. You’ll receive a monthly coworking stipend to use whenever you need a change of pace or in-person collaboration time.
Comprehensive health coverage
Your well-being matters. We cover 99% of medical, dental, and vision premiums for employees, and 65% for dependents.
Flexible savings accounts
We offer FSAs to help cover planned or unexpected healthcare costs. You can also opt into a Dependent Care FSA to support family needs.
Support for every path to parenthood
Through Carrot Fertility, we provide inclusive fertility benefits and family-forming support. All U.S. employees have access, regardless of age, gender identity, or family structure.
Time off that works for you
We trust you to take the time you need. Our unlimited PTO and sick leave policy is designed for flexibility, rest, and real life.
Why Join Us
DataHub is at a rare inflection point: we’ve achieved product-market fit, earned the trust of leading enterprises, and secured backing from top-tier investors like Bessemer Venture Partners and 8VC. The context platform market is expected to grow from $1B to $9B in the next five years—and we’re leading the way.
By joining our team, you’ll:
- Tackle high-impact challenges at the heart of enterprise AI infrastructure
- Ship production systems that power real-world use cases at global scale
- Collaborate with a high-caliber team of builders who’ve scaled some of the most influential data tools in the world
- Build the next generation of AI-native data systems, including conversational agents, intelligent classification, automated governance, and more
If you're passionate about technology, enjoy working with customers, and want to be part of a fast-growing company changing the industry, we want to hear from you!
What We Do
Founded by the leaders who built data teams at LinkedIn and Airbnb, Acryl Data enables you to take back control of your fragmented data stack. We do this by driving DataHub, the #1 open-source metadata platform, which has a community of 8,000+ data practitioners and is deployed in 1,000+ companies.
Acryl DataHub is a third-generation streaming metadata platform that integrates with 50+ tools (dbt, Kafka, Snowflake, Airflow, Looker, etc.) in the data stack to enable data discovery, data lineage, data governance, and data observability.
✅ Connect to your data sources within minutes, and gain end-to-end visibility.
✅ Power mission-critical workflows with a SOC-2-compliant platform.
✅ Bring data and business teams together with a single source of truth to create governed data products.
Powering data teams at Notion, Zendesk, Riskified, and many more!