Senior Data Platform Engineer

New York City, NY, USA
Hybrid
176K-200K Annually
Senior level
Artificial Intelligence
Pinecone, which pioneered the vector database category, is helping advance and scale AI solutions for a better future.
The Role
Design, build, and operate the data platform: ingestion, transformation (DBT), orchestration (Airflow on Kubernetes), and metrics. Manage BigQuery/GKE/Cloud Run/Kafka costs and performance. Lead company-level analyses, partner with finance/GTM/product, ship BI dashboards, enable self-serve analytics, and establish AI-assisted data workflows and best practices.
Summary Generated by Built In
About Pinecone

Pinecone is the leading vector database for building accurate and performant AI applications at scale in production. Pinecone’s mission is to make AI knowledgeable. More than 9,000 customers across various industries have shipped AI applications faster and more confidently with Pinecone’s developer-friendly technology. Pinecone is based in New York and has raised $138M in funding from Andreessen Horowitz, ICONIQ, Menlo Ventures, and Wing Venture Capital.

About the Role

Pinecone is looking for a Senior Data Engineer to own and grow the systems that power how we understand our business. You will design and operate the ingest, transform, orchestration, and metrics layers that feed analysts, executives, and the Board, and you will lead the analyses themselves when the question matters enough. This is a high-ownership role on a small team, with direct exposure to finance, GTM, product, and the executive staff.

Responsibilities
  • Own and build the ingestion layer. Design, deploy, and scale pipelines that pull from third-party APIs, internal services, and SaaS tools into BigQuery. Add new sources as the business demands.

  • Own and build the transform layer. Develop and maintain our DBT project, including staging, intermediate, and marts. Maintain core business datasets: users, organizations, indexes, accounts, usage, revenue. Write tests, snapshots, and documentation. Drive data quality and trust.

  • Own and build the orchestration platform. Operate the Airflow-on-Kubernetes environment that runs our ingest and DBT workloads. Improve reliability, scalability, observability, and CI/CD.

  • Establish and maintain the business-context and metrics layer. Curate metric definitions and documentation that feed both human analysts and agents.

  • Manage infrastructure cost and performance. Manage BigQuery, GKE, Cloud Run, and Kafka costs, right-size compute, and make sure the platform stays efficient.

  • Lead and own mission-critical company-level analyses. Partner with finance, GTM, product, and exec stakeholders to answer business questions, design metrics, run experiments and evaluations, build views in BI tools, and ship dashboards that support key business decisions as well as regular reporting to the Board of Directors.

  • Enable other teams to self-serve. Onboard analysts and non-DE stakeholders onto the warehouse, help them with best practices, and create reusable models and tooling.

  • Set the standard for AI-assisted data workflows. Establish AI best practices and patterns that enable a small data team to operate with outsized leverage.

Qualifications
  • 4+ years building and operating data pipelines in production.

  • Strong SQL, with comfort in BigQuery (or Snowflake/Redshift) writing non-trivial analytical queries, optimizing performance, and reasoning about correctness.

  • Strong coding skills, with comfort writing ETL/rETL, consuming services and building integrations against REST/GraphQL APIs, and producing clean code that others can reuse and maintain.

  • Experience with a modern orchestrator (Airflow, Dagster, Prefect, or similar) running containerized workloads.

  • Comfort with Docker, Kubernetes, and modern cloud infrastructure best practices.

  • Experience integrating systems: moving data between APIs, databases, and warehouses, and handling auth, pagination, schema drift, and incremental loads.

  • Hands-on experience using AI coding tools (Claude Code, Cursor, or similar) as part of your workflow.

  • Ability to design, build, and own systems end-to-end in a highly autonomous environment.

Nice to Have
  • Production DBT experience: layered models, tests, snapshots, macros, deferred builds.

  • Experience working with a semantic layer or metrics layer (DBT Semantic Layer, Cube, LookML).

  • Comfort with exploratory analysis, designing experiments and A/B tests, basic statistical modeling, and separating signal from noise in messy data.

  • Exposure to building AI agents or applications.

  • Infrastructure-as-code (Terraform, Pulumi, or similar).

Perks & Benefits
  • Comprehensive health coverage including medical, dental, vision, and mental health resources

  • 401(k) Plan

  • Equity award

  • Flexible time off

  • Paid parental leave

  • Annual Company Event

  • WFH Equipment Stipend

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, disability, marital status, familial status, sexual orientation, pregnancy, gender identity, gender expression, national origin, ancestry, citizenship status, veteran status, and any other legally protected status under federal, state, or local anti-discrimination laws.


The Company
HQ: New York, NY
125 Employees
Year Founded: 2019

What We Do

Pinecone stands as a leader in the AI industry, dedicated to transforming artificial intelligence and tackling its most pressing challenges head-on. Our mission revolves around augmenting the capabilities of AI models, addressing the issue of AI hallucinations, and ensuring the delivery of accurate and dependable results. Pinecone specializes in vector database technology, a groundbreaking innovation that has reshaped data management for AI applications. Our vector database solutions are designed to help eliminate AI hallucinations, making AI outputs reliable and trustworthy. Moreover, Pinecone's technology streamlines data retrieval, minimizes memory overhead, and optimizes AI operations, crucial for the growing demand for Generative AI (GenAI). With strategic partnerships with industry giants like OpenAI, Google Cloud Platform, AWS, and others, Pinecone plays a central role in advancing the AI landscape. Beyond technology, we foster knowledge sharing through events and educational initiatives, empowering AI engineers and leaders with actionable insights. Pinecone is not just transforming AI; we are helping pave the way for a more dependable and scalable AI future.

Why Work With Us

At Pinecone, our culture emphasizes camaraderie, collaboration, and inclusivity. These values are deeply rooted in our DNA. In the dynamic ML/AI realm, we strive for top engineering solutions while ensuring a shift to AI that aligns with genuine human values.
