The Role
The role involves scaling data systems, improving data infrastructure, mentoring engineers, and collaborating with stakeholders to enhance data-driven decision making at Plaid.
Making data-driven decisions is key to Plaid's culture. To support that, we need to scale our data systems while maintaining correct and complete data. We provide tooling and guidance to teams across engineering, product, and business, and help them explore our data quickly and safely to get the insights they need, which ultimately helps Plaid serve our customers more effectively. We build the data and machine learning infrastructure to enable Plaid engineers to prototype and iterate on products and features built on top of consumer-permissioned financial data.
Engineers on Data Infrastructure are domain experts in Data Warehouse, Data Lakehouse, Spark, Workflow Orchestration, and Streaming technologies. We scale our existing data pipelines in a performant and cost-efficient way while creating the necessary abstractions to make developing on top of this platform extremely simple for other engineers at Plaid.
Responsibilities
- Contributing to the long-term technical roadmap for data-driven and machine learning iteration at Plaid.
- Leading key data infrastructure projects such as improving ML development golden paths, implementing offline streaming solutions for data freshness, building net new ETL pipeline infrastructure, and evolving data warehouse or data lakehouse capabilities.
- Working with stakeholders in other teams and functions to define technical roadmaps for key backend systems and abstractions across Plaid.
- Debugging, troubleshooting, and reducing operational burden for our Data Platform.
- Growing the team via mentorship and leadership, reviewing technical documents and code changes.
Qualifications
- 5+ years of software engineering experience
- Extensive hands-on software engineering experience, with a strong track record of delivering successful projects within the Data Infrastructure or Platform domain at similar or larger companies.
- Deep understanding of at least one of the following: ML Infrastructure systems (including Feature Stores, Training Infrastructure, Serving Infrastructure, and Model Monitoring) or Data Infrastructure systems (including Data Warehouses, Data Lakehouses, Apache Spark, Streaming Infrastructure, and Workflow Orchestration).
- Strong cross-functional collaboration, communication, and project management skills, with proven ability to coordinate effectively.
- Proficiency in coding, testing, and system design, ensuring reliable and scalable solutions.
- Demonstrated leadership abilities, including experience mentoring and guiding junior engineers.
- [Nice to have] Experience with Databricks, Airflow, AWS EMR
Top Skills
Airflow
AWS EMR
Data Lakehouse
Data Warehouse
Databricks
Spark
Streaming Technologies
Workflow Orchestration
The Company
What We Do
Plaid is used by thousands of digital financial apps and services like Betterment, Expensify, Microsoft, and Venmo, and by many of the largest banks, to make it easy for consumers to connect their financial accounts with the apps and services they want to use. Plaid connects with over 11,000 financial institutions across the U.S., Canada, and Europe.
At Plaid, we have diverse backgrounds and skills, but we're all passionate about building a more efficient and inclusive financial infrastructure—together.