Data Scientist

Posted 2 Days Ago
New York City, NY, USA
In-Office
200K-225K Annually
Mid level
Agency
The Role
Build and deploy production-scale feature engineering pipelines and predictive models over terabytes of consumer data. Own the end-to-end path from raw data to reliable features and models that autonomous agents can consume, improve income/wealth and propensity models, and help architect the lakehouse and data infrastructure that support agentic systems.
Summary Generated by Built In

About Minerva

Minerva builds AI for marketing leaders. Our platform allows marketers to focus on telling their brand's story, delegating operationally intensive work to our AI agents, which handle data management, analytics, campaign generation, measurement, and reporting.

Everything is built on Minerva's proprietary consumer graph, an identity and attribute layer covering 270M+ U.S. consumers across 1,000+ temporal attributes. We have two agentic systems built through an OpenAI research partnership: an Agentic Data Engineer that unifies and standardizes a brand's first-party data in hours, and an Agentic Data Scientist that trains robust targeting models at scale. Together, these systems enhance the quality of first-party data, increase campaign performance, and give marketing teams back their time.

Our clients include leading consumer brands across categories: the NBA, Ramp, Capital One, Hard Rock Stadium Group / Miami Dolphins, Wander, and Trust & Will. We have raised $20M from The General Partnership, 8VC, Lingotto, NBA Investments, Topology Ventures, Future Positive, Background Capital, and others.

About the Role

As a Data Scientist at Minerva, you build the models and features that power our consumer graph and the agents that run on top of it. You sit at the intersection of heavy data engineering and applied modeling: you architect feature engineering pipelines that are computed over terabytes of data, train and sharpen the models that drive targeting and prediction, and ensure the outputs are robust enough to be consumed autonomously by our Minerva Agents and to feed our world-class modeled attributes (e.g. income/wealth).

In this role you will deploy to production constantly. The models you build are not handed off to someone else to deploy; you own the path from raw data to a feature or model that an agent can call reliably at scale. As we grow, your work becomes the foundation other systems are built on.

What You'll Do

  • Create new features for models and agents, expanding the predictive surface area of our consumer data lake and building the pipelines that turn raw signal into trusted attributes.

  • Improve existing models through rigorous feature engineering, including our income/wealth, home buyer, and home seller models.

  • Play a pivotal role in the buildout of our world-class data lake, shaping how terabytes of consumer data are stored, transformed, and made queryable for both humans and agents.

  • Build feature engineering pipelines that run efficiently at terabyte scale, with the data engineering rigor to make them reliable in production. This role is a 70/30 split between data science and data engineering.

  • Ensure model and feature outputs are reliable enough to be consumed agentically, writing the validations and guardrails that let our agents act on your work without a human in the loop.

Our Data Stack

  • Dagster for all things orchestration

  • dbt-core within Dagster as the primary data transformation surface

  • Spark, Iceberg, Trino, AWS Glue for Lakehouse workloads

  • Modal for ML engineering

  • Frontier + OSS models & agent SDKs. We are heavy users of OpenAI/Anthropic batch APIs

Qualifications

  • 2-4+ years working as a data scientist, an applied-ML-focused data engineer, or a software engineer in a data-heavy context. Simply put, you live and breathe data.

  • Highly proficient in Python and SQL.

  • You are driven by first-principles thinking and are a go-getter. You reason about what datasets and features are necessary to solve a modeling problem, and are scrappy and clever enough to bring that to life.

  • Strong intuition for data engineering principles, especially around data cleaning/ingestion and data modeling. We prefer these core skills to be second-nature, freeing up thinking for architecting and executing large-scale data initiatives, especially given the advancement of AI coding tools.

  • Strong engineering background. You are comfortable deploying complicated production pipelines and working within larger production systems, not just in sandboxed or research environments.

  • Willingness to work in office in NYC (we provide a relocation package).

  • Flexibility and openness to wearing several hats. We are lean and things are always changing.

  • Eagerness to learn and grow with the company and your coworkers.

Preferred

  • Experience building and training predictive models (e.g. lead scoring, LTV, propensity, lookalike modeling).

  • Experience with orchestration tools (e.g. Dagster, Airflow, Prefect) and SQL transformation tools (e.g. dbt, SQLMesh).

  • Experience with both transactional databases (e.g. Postgres, MySQL) and analytical databases (e.g. Snowflake, Redshift), with a bias toward the latter.

  • Familiarity with a cloud resource provider (e.g. AWS, GCP).

  • Familiarity with backend and ML/AI engineering.

  • Experience with AI coding tools (e.g. Cursor, Claude Code, OpenCode) as a force multiplier.

  • Prior work at an early-stage startup.

You don't need to tick every box. If you're strong on the engineering side and hungry to build models that matter, we want to hear from you.

Compensation

Base salary: $200,000 to $225,000, commensurate with experience. Competitive equity and a marquee benefits package.

Skills Required

  • 2-4+ years working as a data scientist, applied ML-focused data engineer, or software engineer in a data-heavy context
  • Highly proficient in Python
  • Highly proficient in SQL
  • Strong intuition for data engineering principles (data cleaning, ingestion, data modeling)
  • Strong engineering background; comfortable deploying production pipelines and working within larger production systems
  • Experience building feature engineering pipelines and models that run reliably at terabyte scale
  • Ability to write validations and guardrails so models/features can be consumed autonomously by agents
  • Willingness to work in-office in NYC (relocation package provided)
  • Flexibility and openness to wearing several hats at an early-stage company
  • Eagerness to learn and grow with the company
  • Experience building and training predictive models (lead scoring, LTV, propensity, lookalike)
  • Experience with orchestration tools (Dagster, Airflow, Prefect) and SQL transformation tools (dbt, SQLMesh)
  • Experience with transactional and analytical databases (Postgres, MySQL, Snowflake, Redshift)
  • Familiarity with cloud providers (AWS, GCP)
  • Familiarity with backend and ML/AI engineering and ML deployment tooling (Modal, agent SDKs, OpenAI/Anthropic APIs)
  • Experience with AI coding tools (Cursor, Claude Code, OpenCode) and prior early-stage startup experience

The Company
HQ: Denver, CO
5 Employees
Year Founded: 2015

What We Do

Our team provides virtual CTOs that specialize in creating both back-end and front-end applications that scale. We have worked on a wide range of enterprise applications. Whether you are looking to go from one thousand customers to one million, or get your technology to a state where your company can be funded.
