Data Crawling and Scraping Engineer

Posted 16 Days Ago
New York, NY
Hybrid
Mid level
Big Data • Software • Business Intelligence
In a world rife with complex relationships and hidden risk, we stand as torchbearers of corporate transparency.
The Role
As a Data Crawling and Scraping Engineer, you will design and implement systems to gather and structure external data, supporting product development and data insights while ensuring accuracy and performance.
Summary Generated by Built In

WireScreen is a fast-growing Series A startup building the go-to open source intelligence platform for navigating global supply chains and China-related risk. While China maintains some of the world’s most detailed corporate ownership records, the real challenge is connecting the dots. That’s where we come in—surfacing the networks, relationships, and financial ties behind companies to support national security, compliance, and regulatory oversight.

Backed by Sequoia Capital and Harpoon Ventures, our team includes a two-time Pulitzer Prize-winning journalist and senior engineers from Google, Twitter, and Oracle. We launched our product just three years ago and already have strong traction with top-tier government customers, and we’re just getting started. If you're excited to bring transparency to complex global systems, now’s the perfect time to join us.

Check out this blog from our CEO on how WireScreen traced DeepSeek’s origins back to 2023—well before it went mainstream in 2025.

About the role:

As a Data Crawling and Scraping Engineer, you'll play a critical role in acquiring and structuring high-value external data that powers our core products. Your work will fuel our knowledge graph of millions of entities and directly support our mission to deliver transparency and insight into complex global networks. Your knowledge of core web technologies, such as networking, DNS, CAPTCHAs, APIs, WAFs, and proxies, will open up the vast scale of open source intelligence data for our internal stakeholders and our customers.

You’ll work closely with engineering, research, and product teams to identify new data sources, develop reliable pipelines to gather, ingest, and structure that data, and continuously improve our ability to scale and adapt. You'll have ownership over how information flows into our platform — from design and architecture to reliability and performance — and help shape the systems that underpin our next generation of features and products.

What You'll Do:
  • Design and implement systems to crawl, scrape, extract, and normalize external data from a variety of web-based sources.

  • Collaborate with researchers and analysts to identify new sources of valuable company, economic, and supply chain data and define integration strategies.

  • Build robust, scalable pipelines that ingest structured and semi-structured data into our database.

  • Ensure high levels of accuracy, coverage, and freshness across incoming data streams.

  • Contribute to the evolution of our data platform and internal tooling.

  • Improve system reliability, observability, and performance over time.

You Should Apply If You...
  • Have 3+ years of experience as a backend or full-stack software engineer.

  • Have intimate knowledge of how to crawl the internet at scale.

  • Have strong programming skills, especially in Python.

  • Have experience working with structured and unstructured data from diverse external systems.

  • Are comfortable debugging complex issues involving networking, content rendering, or inconsistent source data.

  • Are proficient with SQL and relational databases.

  • Are a clear communicator who collaborates effectively with both technical and non-technical teammates.

  • Are passionate about turning raw data into meaningful insight, and eager to work on technically nuanced challenges.

Bonus Points If You Have...
  • Familiarity with headless browser automation or techniques for collecting data from dynamic content sources.

  • Expertise in the architecture, technologies, and tools that run the modern internet, such as DNS, networking, CDNs, WAFs, CAPTCHAs, proxies, and reverse proxies.

  • Experience with event-driven architecture.

  • Eagerness to incorporate new technologies and validate their usefulness using structured experiments and thorough testing.

  • Experience building health monitoring and observability tools for consumption by automated tools, engineers, and non-technical stakeholders.

What You'll Love About WireScreen

At WireScreen, you'll do high-impact work that helps shape global commerce and policy. We’re a mission-driven team with a growth mindset—curious, collaborative, and unafraid to take on bold challenges. You’ll be empowered to act, heard when you speak, and supported as you grow. With strong market momentum and ambitious goals, this is an exciting time to join us and help build something that truly matters.

Benefits & Perks

At WireScreen, we care deeply about our team and are committed to supporting your well-being—both in and out of the workplace. Here’s how we take care of our employees:

  • Competitive compensation including salary, equity, and rapid growth potential

  • 100% company-paid Medical, Dental, and Vision coverage for employees

  • FSA, HSA, and 401(k) options to help you plan for healthcare expenses and retirement

  • Generous paid time off plus company-wide holidays to help you rest and recharge

  • Commuter benefits for NYC-based and D.C.-based employees

  • Hybrid office schedule for NYC-based and D.C.-based employees

Top Skills

APIs
CAPTCHAs
DNS
Networking
Proxies
Python
SQL
WAFs

The Company
HQ: New York, NY
36 Employees
Year Founded: 2019

What We Do

At WireScreen, we’re not just building a corporate intelligence platform; we are reinventing how multinational corporations, governments, and law enforcement agencies make critical decisions.

In this era of heightened geopolitical scrutiny, it’s never been more challenging or consequential to understand who you are doing business with. We are building the platform that puts this actionable intelligence just a few clicks away.

Our sophisticated software platform helps shine a light on global supply chains, surface regulatory exposure, and provide insights that drive due diligence and compliance. We eliminate the noise and mundane aspects of corporate data and illuminate the stories and relationships that matter.

Why Work With Us

We’re simultaneously building a powerful global intelligence platform and an award-winning weekly digital news magazine that reports on one of the biggest stories of our time. Join our team to make an impact and help bring transparency to the complex dealings of global businesses. Learn and work alongside talented colleagues on this exciting journey.
