Web Scraping Engineer

Posted 3 Hours Ago
Hiring Remotely in India
Remote
Mid level
Big Data • eCommerce
The Role
Design, refactor, and maintain web scrapers, implement advanced scraping techniques, collaborate with teams, monitor performance, and drive continuous improvement.

About YipitData:

YipitData is the leading market research and analytics firm for the disruptive economy and recently raised up to $475M from The Carlyle Group at a valuation over $1B.

We analyze billions of alternative data points every day to provide accurate, detailed insights on ridesharing, e-commerce marketplaces, payments, and more. Our on-demand insights team uses proprietary technology to identify, license, clean, and analyze the data many of the world’s largest investment funds and corporations depend on.

For three years and counting, we have been recognized as one of Inc.'s Best Workplaces. We are a fast-growing technology company backed by The Carlyle Group and Norwest Venture Partners. Our offices are located in NYC, Austin, Miami, Denver, Mountain View, Seattle, Hong Kong, Shanghai, Beijing, Guangzhou, and Singapore. We cultivate a people-centric culture focused on mastery, ownership, and transparency.

Why You Should Apply NOW:

  • High Impact: Your work will directly influence key reports and strategic decisions across multiple business units.
  • Exciting Challenges: Tackle the design of resilient web scrapers, navigate dynamic website structures, and optimize large-scale data extraction.
  • Growth Opportunities: As an early member of our expanding Data Solutions team, you will have significant input on our strategies, processes, and team culture.

About The Role:

We are seeking a Web Scraping Engineer (official internal title: Data Solutions Engineer) to join our growing Data Solutions team. Reporting directly to the Data Solutions Engineering Manager, you will play a pivotal role in designing, refactoring, and maintaining the web scrapers that power critical reports across our organization. Your contributions will ensure our data ingestion processes are resilient, efficient, and scalable, directly supporting multiple business units and products.

As Our Data Solutions Engineer You Will:

Refactor and Maintain Web Scrapers

  • Overhaul existing scraping scripts to improve reliability, maintainability, and efficiency.
  • Implement best coding practices (clean code, modular architecture, code reviews, etc.) to ensure quality and sustainability.

Implement Advanced Scraping Techniques

  • Utilize sophisticated fingerprinting methods (cookies, headers, user-agent rotation, proxies) to avoid detection and blocking.
  • Handle dynamic content, navigate complex DOM structures, and manage session/cookie lifecycles effectively.

Collaborate with Cross-Functional Teams

  • Work closely with analysts and other stakeholders to gather requirements, align on targets, and ensure data quality.
  • Support internal users of our web scraping tooling by providing troubleshooting, documentation, and best practices to ensure efficient data usage for critical reporting.

Monitor and Troubleshoot

  • Develop robust monitoring solutions and alerting frameworks to quickly identify and address failures.
  • Continuously evaluate scraper performance, proactively diagnosing bottlenecks and scaling issues.

Drive Continuous Improvement

  • Propose new tooling, methodologies, and technologies to enhance our scraping capabilities and processes.
  • Stay up to date with industry trends, evolving bot-detection tactics, and novel approaches to web data extraction.

This is a fully remote opportunity based in India. Standard work hours are 11am to 8pm IST, with some flexibility.

You Are Likely To Succeed If You Have:

  • Effective communication skills in English with both technical and non-technical stakeholders.
  • 4+ years of experience with web scraping frameworks (e.g., Selenium, Playwright, or Puppeteer).
  • Strong understanding of HTTP, RESTful APIs, HTML parsing, browser rendering, and TLS/SSL mechanics.
  • Expertise in advanced fingerprinting and evasion strategies (e.g., browser fingerprint spoofing, request signature manipulation).
  • Deep experience managing cookies, headers, session states, and proxy rotations, including the deployment of both residential and data center proxies.
  • Experience with logging, metrics, and alerting to ensure high availability.
  • Troubleshooting skills to optimize scraper performance for efficiency, reliability, and scalability.

What We Offer:

Our compensation package includes comprehensive benefits, perks, and a competitive salary: 

  • We care about your personal life and we mean it. We offer vacation time, parental leave, team events, learning reimbursement, and more!
  • Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust.

We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal-opportunity employer.

Top Skills

HTML Parsing
HTTP
Playwright
Puppeteer
RESTful APIs
Selenium
TLS/SSL
The Company
HQ: New York, NY
470 Employees
Hybrid Workplace
Year Founded: 2013

YipitData is hiring. Come join the future of data-driven market research: yipitdata.com/careers
