Data Scientist - Gen AI / QA - Remote

Sorry, this job was removed at 12:02 a.m. (CST) on Saturday, Jul 19, 2025
Hiring Remotely in Córdoba, Capital, Córdoba
In-Office or Remote
Information Technology • Software • Database
The Role
Description

About Us

At Zyte, we eat data for breakfast, and you can eat your breakfast anywhere and work for Zyte. Founded in 2010, we are a globally distributed team of over 250 Zytans working from over 28 countries who are on a mission to enable our customers to extract the data they need to continue to innovate and grow their businesses. We believe that all businesses deserve a smooth pathway to data.

For more than a decade, Zyte has led the way in building powerful, easy-to-use tools to collect, format, and deliver web data, quickly, dependably, and at scale. Today, the data we extract helps thousands of organizations make smarter business decisions, secure competitive advantage, and drive sustainable growth. Over 3,000 companies and 1 million developers rely on our tools and services to get the data they need from the web.

Data QA is an important function within Zyte. The Data QA team works to ensure that the quality and usability of the data scraped by our web scrapers meet and exceed the expectations of our enterprise clients.

Are you passionate about data and data quality and integrity?

Do you enjoy using Python and AI to analyze and manipulate data, detect data quality issues, and visualize your findings?

Are you highly customer-focused with excellent attention to detail?

Owing to growing business and the need for ever more sophisticated Data QA, we are looking for a talented Data Scientist to join our team. As a Zyte Engineer, you will work on AI-based data wrangling, data manipulation, and data visualisation techniques and apply them to verifying and validating the quality of data extracted from the web.

Requirements

Roles & Responsibilities:

  • Understand customer web scraping and data requirements; map these requirements to custom AI-based data quality validation techniques, with a focus on achieving pre-established degrees of data quality and uncovering data quality issues.
  • Draw conclusions about data quality by producing descriptive and evidence-based statistics, summaries, and visualisations.
  • Supplement existing manual QA and schema validation techniques with AI-based data quality verification.
  • Collaborate with developers to further troubleshoot and pinpoint solutions.
  • Present findings and conclusions to stakeholders at various levels (other members of the QA department, developers, project managers, account managers, customers).
  • Write high-quality, well-structured code that is maintainable and extensible.
  • Manage code using GitHub, Bitbucket, and other version control tools as applicable.

Requirements:

  • Highly proficient in Python and the PyData stack, with a minimum of 3 years of experience (please provide code samples in your application, ideally pertaining to data analysis or Generative AI, via a link to GitHub or another publicly accessible service).
  • BS degree in Computer Science, Engineering, Mathematics, Statistics or equivalent.
  • Up to speed on the latest advances in Generative AI, particularly as they pertain to process automation, web scraping/parsing, and data quality verification.
  • Comfortable with Prompt Engineering and token/cost optimization.
  • Familiar with abstraction layers (MCP, Marvin, LangChain, etc.).
  • Experience coding against the APIs of at least one of the Google, OpenAI, or Anthropic models.
  • Experience visualising data quality metrics and data quality issues.
  • Ability to work with very large datasets (into the millions of records).
  • Strong knowledge of software QA methodologies, tools, and processes.
  • Excellent level of written and spoken English; confident communicator; able to communicate on both technical and non-technical levels with various stakeholders on all matters of QA.
  • Outstanding attention to detail.

Desired Skills:

  • Prior experience in a Data QA role (where the focus was on verifying data quality, rather than testing application functionality).
  • Familiarity with Jupyter and JupyterLab.
  • Experience building your own dashboards.
  • Experience with Spark, BigQuery, and other big data technologies.
  • Previous remote working experience.
Benefits

As a new Zytan, you will:

  • Become part of a self-motivated, progressive, multi-cultural team.
  • Have the freedom and flexibility to work from where you do your best work.
  • Attend conferences and meet with team members from across the globe.
  • Work with cutting-edge open source technologies and tools.


The Company
Cork
219 Employees
Year Founded: 2010

What We Do

At Zyte, we’re all about empowering data-driven organizations to ethically and accurately collect web data to power their business. With over 14 years of experience and our early authorship and ongoing maintenance of Scrapy, we’ve shaped the web scraping industry from Day 1.

We help our clients…

- Collect, format and deliver web data in easy-to-use ways, quickly, dependably and at scale,
- Spend more time gleaning insights from highly accurate, business-critical data, and
- Spend less money on the total cost of ownership in web data extraction.

Zyte API abstracts away a historically disparate web data extraction tech stack into a single tool. Zyte API automates most anti-bot and proxy management, so developers can spend more time on strategy.

Zyte API is a full-stack solution that crawls, unblocks and extracts data in minutes with the power of AI. Developers skip the hassle of creating manual parsing code and extract public data at unlimited scale.

Zyte Data is an expert web data extraction team in your pocket. Our white glove service extracts any web data your business needs, regardless of project size and complexity. This includes a dedicated team and round-the-clock support.

Zyte’s legal team is our backbone and is made up of the leading minds in web data extraction compliance. They stay on top of the ever-changing and opaque laws that loom over the industry. They evaluate compliance risks and inform customers about best practices.

Zyte is certified by and a co-founder of the Ethical Web Data Collection Initiative (EWDCI) which recognizes web data providers operating with the highest level of ethical and legal standards.

Come work for us!

We encourage a flexible and diverse work environment, and we have embraced the benefits of remote work since our very early beginnings. Our team includes over 200 employees in over 30 countries, all sharing the same drive: to do more with web data.
