Data Scientist — Blockchain Intelligence

Hiring Remotely in Belgium
Remote
Mid level
Cryptocurrency
The Role
Design, test, and deploy clustering and attribution heuristics across large on-chain datasets; build end-to-end data pipelines, validate metrics, investigate edge cases, and partner with investigations and product to benchmark and productionize models.
Summary Generated by Built In
⚡️ About Merkle Science
Merkle Science provides blockchain transaction monitoring and intelligence solutions for web3 companies, digital asset service providers, financial institutions, law enforcement and government agencies to detect, investigate, and prevent illicit use of cryptocurrencies. Our vision is to make cryptocurrencies safe and provide infrastructure for the safe and compliant growth of cryptocurrencies.

Merkle Science is headquartered in New York with offices in Singapore, Bangalore and London. The team has combined experience across Bank of America, PayPal, Luno, Thomson Reuters and Amazon. The company has raised over $27M from SIG, Beco, Republic, DCG, Kenetic, GGV and several others.

About the role

We turn raw on-chain activity into trustworthy intelligence — clustering addresses into real-world entities, attributing them to services and actors, and surfacing risk for compliance and investigations teams. We're looking for a data scientist who is as comfortable shipping a heuristic to production as they are designing it: someone who can move from a messy hypothesis to a working pipeline without waiting on someone else to wire up the data.

You'll work closely with our attribution and clustering leads on models and heuristics that run across billions of transactions and multiple chains (Bitcoin, Ethereum, Tron, Solana, and more).

What you'll do
  • Design, test, and ship clustering and attribution heuristics, and measure them with real precision/coverage metrics rather than vibes.

  • Own your data end to end — pull, clean, join, and model large on-chain datasets without depending on a separate team for every query.

  • Build and maintain the pipelines that take a heuristic from notebook to production, including backfills, incremental runs, and validation.

  • Investigate edge cases (mixers, bridges, exchange hot wallets, consolidation patterns) and translate findings into repeatable logic.

  • Partner with investigations and product to define what "correct" looks like and benchmark against ground truth.

  • Prototype quickly, then harden what works.
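
As an illustration of the precision/coverage measurement mentioned in the list above, a minimal Python sketch might look like the following. It is not Merkle Science's actual method: the function names, the pairwise definition of precision, the definition of coverage as "share of addresses assigned to any cluster", and the toy addresses are all assumptions made for this example.

```python
from itertools import combinations


def pairwise_precision(predicted, truth):
    """Share of predicted same-cluster pairs that the ground truth confirms.

    predicted: dict of address -> predicted cluster id (hypothetical format)
    truth:     dict of address -> known entity label, labeled addresses only
    Pairs where either address is unlabeled are skipped, not counted as wrong.
    """
    by_cluster = {}
    for address, cluster_id in predicted.items():
        if address in truth:
            by_cluster.setdefault(cluster_id, []).append(address)

    confirmed = total = 0
    for members in by_cluster.values():
        for a, b in combinations(sorted(members), 2):
            total += 1
            confirmed += truth[a] == truth[b]
    return confirmed / total if total else None


def coverage(predicted, universe):
    """Share of addresses in `universe` that the heuristic put in any cluster."""
    if not universe:
        return None
    return sum(1 for address in universe if address in predicted) / len(universe)


# Toy data, invented for illustration only.
predicted = {"a1": "c1", "a2": "c1", "a3": "c1", "a4": "c2", "a5": "c2"}
truth = {
    "a1": "exchange_x",
    "a2": "exchange_x",
    "a3": "mixer_y",
    "a4": "service_z",
    "a5": "service_z",
}
universe = ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"]

precision = pairwise_precision(predicted, truth)
cov = coverage(predicted, universe)
print(f"pairwise precision: {precision}, coverage: {cov}")
```

A more aggressive heuristic would typically raise coverage while lowering precision, which is the trade-off the role asks candidates to reason about.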

What we're looking for
  • 4+ years building data science or data engineering systems that actually shipped (not just notebooks).

  • Strong Python and SQL; comfortable with large datasets and the gotchas of joins, dedup, and skew at scale.

  • Solid grasp of clustering, graph/network analysis, or entity resolution — and a habit of validating results, not just producing them.

  • Ability to reason about precision vs. coverage trade-offs and defend your metrics.

  • Self-directed: you can scope an ambiguous problem, get the data yourself, and drive it to a result.

Our tech stack

You don't need to have used all of these, but here's what you'd be working with day to day:

  • Databricks — our lakehouse and processing backbone. Large-scale on-chain datasets are transformed and modeled here via Spark and SQL; most heuristics run as Databricks jobs against billions of transactions.

  • Kafka — real-time ingestion of on-chain and transaction data. New blocks and events stream in continuously, so a lot of our work is designed to run incrementally rather than as one-off batch jobs.

  • Python — the primary language for everything from exploratory analysis to production heuristics and pipeline code.

  • TigerGraph — our graph database, where addresses, transactions, and entities live as a network. Clustering, traversals, and relationship queries (who funds whom, consolidation paths, entity linkage) happen here.

Supporting cast you'll likely touch:

  • SQL everywhere — for ad-hoc analysis, validation, and defining ground-truth datasets.

  • Columnar / analytical stores (e.g., ClickHouse) for fast aggregate queries over large tables.

  • Orchestration & scheduling for backfills and recurring pipeline runs.

  • Git / GitHub for version control and code review — we expect pipelines and heuristics to be reviewed like any other code.

  • GCP as our cloud environment.

How we work

Small, high-trust team. You'll have a lot of ownership and very little bureaucracy. We prototype fast, measure honestly, and ship.


❤️ Well Being, Compensation and Benefits
We care about your well-being. Along with excellent health insurance, we offer flexible time off, learning & development initiatives and hours that are designed to provide work/life balance. We regularly host team-building sessions and encourage discussions around mental health.

We reward talent and believe in acknowledging people for their contributions. We offer industry-leading compensation, along with generous equity. As a rapidly growing business, there are endless opportunities to grow your career with Merkle Science.

Skills Required

  • 4+ years building data science or data engineering systems that shipped to production
  • Strong Python
  • Strong SQL and experience with large datasets, joins, deduplication, and skew at scale
  • Experience designing, testing, and shipping clustering, attribution, or entity-resolution heuristics
  • Familiarity with clustering, graph/network analysis, or entity resolution and validating results
  • Ability to reason about precision versus coverage trade-offs and defend metrics
  • Self-directed: scope ambiguous problems, pull and prepare data, drive to production results
  • Experience building and maintaining production pipelines (backfills, incremental runs, validation)
  • Experience with Databricks and Spark
  • Experience with Kafka for real-time ingestion
  • Experience with graph databases (e.g., TigerGraph) and graph traversals
  • Familiarity with columnar/analytical stores (e.g., ClickHouse)
  • Familiarity with GCP
  • Familiarity with Git/GitHub and code review for pipelines

The Company
HQ: New York, NY
68 Employees
Year Founded: 2018

What We Do

Founded in 2018, Merkle Science is the next-generation predictive cryptocurrency risk and intelligence platform that helps crypto companies, financial institutions, and government entities detect, investigate, and prevent illegal activities involving cryptocurrencies. Merkle Science’s proprietary Behavioral Rule Engine enables our tools to go beyond blacklists so that compliance teams may fulfill their local KYC/AML obligations and industry players may keep pace with the industry’s increasingly complex illicit activities. Merkle Science envisions a world powered by crypto and is creating the infrastructure necessary to ensure the safe and healthy growth of the cryptocurrency industry as it becomes a key pillar of the $22 trillion financial services ecosystem. We enable businesses to scale and mature so that a full range of individuals, entities, and services may transact with crypto safely.
