Senior Data Scientist - Big Data R&D, Identity Graph & KYC

Posted 22 Days Ago
Hiring Remotely in United States
Remote
170K-200K Annually
Senior level
Artificial Intelligence • Machine Learning • Software • Analytics
Our mission is to verify 100% of good identities in real-time and completely eliminate identity fraud on the internet.
The Role
As a Senior Data Scientist, you will design and deploy machine learning and graph algorithms to enhance identity trust solutions, own end-to-end projects, and collaborate with product and engineering teams to improve KYC and compliance products.
Summary Generated by Built In
Why Socure?

Socure is building the identity trust infrastructure for the digital economy — verifying 100% of good identities in real time and stopping fraud before it starts. The mission is big, the problems are complex, and the impact is felt by businesses, governments, and millions of people every day.

We hire people who want that level of responsibility. People who move fast, think critically, act like owners, and care deeply about solving customer problems with precision. If you want predictability or narrow scope, this won’t be your place. If you want to help build the future of identity with a team that holds a high bar for itself — keep reading.

About the Role

The Big Data R&D team develops cutting‑edge big data and graph‑based solutions for entity search, entity resolution, and identity matching that power Socure’s KYC and compliance products.

As a Senior Data Scientist I, you will lead the design and deployment of advanced ML and graph algorithms on large-scale PII datasets, own end‑to‑end projects from problem definition through production validation, and serve as a key technical partner to Product, Engineering, and Client‑facing teams. You will help define standards for feature engineering, experimentation, and data quality across our identity graph stack, with substantial impact on coverage, accuracy, and fairness.

What You'll Do
  • Own the design, development, and evaluation of machine learning, statistical, and graph-based algorithms for entity resolution, identity trust scoring, and anomaly detection on massive datasets.

  • Architect and optimize graph-based identity representations (identity graph structure, linkage rules, clustering) to improve match rates, reduce false positives/negatives, and support downstream fraud and KYC models.

  • Build and maintain scalable data pipelines and feature stores in Spark/PySpark (or Scala), including data normalization, deduplication, and feature computation across large PII datasets in AWS/Databricks environments.

  • Lead A/B tests and offline/online experimentation for new models, features, and data sources; define success metrics, design experiments, and ensure rigorous validation before rollout.

  • Evaluate new internal and external data sources: explore signal quality, design backtests, quantify incremental value, and provide clear recommendations on vendor selection and integration.

  • Partner closely with product managers and engineers to translate ambiguous business and regulatory requirements (e.g., KYC coverage, watchlist matching) into concrete modeling and data roadmaps.

  • Provide deep analytical support to Socure’s compliance and regulatory product suite, including investigative analyses, root‑cause analysis for anomalies, and clear narratives for internal and external stakeholders.

  • Contribute to model governance and documentation: clearly explain model logic, data dependencies, limitations, and monitoring plans to internal risk/compliance stakeholders.

  • Mentor junior data scientists and engineers on best practices in data exploration, feature engineering, experimentation, and code quality.

  • Communicate complex technical concepts and trade‑offs in a concise, structured way to both technical and non‑technical audiences (e.g., product reviews, customer meetings, internal briefings).

What You Bring
  • Master’s degree with 3+ years of relevant industry experience, or Ph.D. with 1+ years of experience in applied ML / data science roles; background in Computer Science, Statistics, Mathematics, or related quantitative fields preferred.

  • Strong proficiency in Python (preferred) or Scala, including experience with ML libraries such as scikit‑learn, XGBoost, TensorFlow or PyTorch.

  • Extensive experience with Spark or PySpark and distributed data systems (e.g., AWS EMR, Databricks) working on very large, messy datasets.

  • Deep understanding of supervised and unsupervised learning, feature engineering, model evaluation, and experiment design (A/B testing, holdout strategies, stratification).

  • Experience developing production-quality data pipelines and automated workflows using Airflow or similar orchestration tools.

  • Practical familiarity with graph databases and/or graph frameworks (Neo4j, AWS Neptune, GraphFrames, DGL, PyTorch Geometric) and graph algorithms for clustering, link prediction, and community detection is strongly preferred.

  • Solid SQL skills and experience working with large-scale analytical data stores.

  • Experience in at least one of: identity verification, fraud detection, credit risk, or adjacent high‑stakes domains is a plus.

  • Demonstrated ability to lead medium‑to‑large projects end‑to‑end, make sound trade‑off decisions under ambiguity, and influence cross‑functional stakeholders with data and clear reasoning.

Please note that sponsorship is not available at this time, and that you must be located within 45 miles of a talent hub to be considered.

Socure is an equal opportunity employer that values diversity in all its forms within our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
If you need an accommodation during any stage of the application or hiring process—including interview or onboarding support—please reach out to your Socure recruiting partner directly.

Skills Required

  • Master's degree with 3+ years of relevant industry experience, or Ph.D. with 1+ years of experience in applied ML / data science roles
  • Strong proficiency in Python or Scala
  • Extensive experience with Spark or PySpark and distributed data systems
  • Deep understanding of supervised and unsupervised learning, feature engineering, model evaluation, and experiment design
  • Experience developing production-quality data pipelines and automated workflows using Airflow
  • Practical familiarity with graph databases and/or graph frameworks
  • Solid SQL skills and experience working with large-scale analytical data stores

The Company
Chennai, Tamil Nadu
386 Employees
Year Founded: 2012

What We Do

Socure is the leading platform for digital identity trust. Its predictive analytics platform applies artificial intelligence and machine learning techniques with trusted online/offline data intelligence from email, phone, address, IP, device, velocity, and the broader internet to verify identities in real time. The company has more than 750 customers across the financial services, gaming, telecom, and e-commerce industries, including three of the top five banks, seven of the top 10 card issuers, three of the top MSBs, the top payroll provider, the top credit bureau, and over 100 of the largest and most successful FinTechs. Marquee customers include Chime, Varo Money, Public, Stash, and DraftKings. Socure has received numerous industry awards and accolades, including being named to Forbes America’s Best Startup Employers 2021, being awarded Best New Technology Introduced over the Last 12 Months – Data and Data Services at the 2020 American Financial Technology Awards (AFTAs), being ranked number 70 in Deloitte’s Technology Fast 500™, being listed as a Gartner Cool Vendor, being recognized by Forbes as one of the Top 25 Machine Learning Startups to Watch, being named to CB Insights: The FinTech 250, and being awarded Finovate’s Award for Best Use of AI/ML, to name a few.

Why Work With Us

Socure is a critical part of the infrastructure of the digital economy, and what we do is essential to the safety of anyone doing business on the internet. Because of our technology, digital identity theft will be eradicated and more people will be included in the digital economy than ever before.
