Forbes Advisor: Data Research Engineer Role - AI/ML

Posted 3 Days Ago
Hiring Remotely in India
Remote
Mid level
Artificial Intelligence • HR Tech • Professional Services • Software
The Role
Design and implement AI/LLM-driven data extraction and ETL pipelines, build web crawling and API integrations, implement RAG systems, mitigate LLM hallucinations, evaluate third-party tools, and collaborate with content and analytics teams to deliver high-quality, scalable data for downstream analysis.
Summary Generated by Built In
Job Title:

Data Research Engineer – Data Extraction Team

Experience:

4+ Years

Location:

Chennai or Remote (India)

About Forbes Advisor

Forbes Advisor is a new initiative under the Forbes Marketplace umbrella that provides journalist- and expert-written insights, news, and reviews on personal finance, health, business, and everyday life decisions.

Our mission is to help readers turn their aspirations into reality by arming them with trusted advice and data-driven insights, enabling them to make confident decisions and focus on what matters most.

The Marketplace team brings decades of industry experience across geographies and functions including Content, SEO, Business Intelligence, Finance, HR, Marketing, Production, Technology, and Sales, with expertise in diverse sectors such as consumer credit, banking, insurance, small business, education, real estate, and travel.

About the Data Extraction Team

The Data Extraction Team plays a crucial role in designing, implementing, and maintaining advanced web scraping frameworks and data pipelines. The team develops methodologies to gather precise, high-quality data from a wide range of digital sources and ensures its seamless integration into internal systems through ETL (Extract, Transform, Load) processes.

This team also explores the integration of Artificial Intelligence (AI) and Large Language Models (LLMs) to automate and enhance data extraction, processing, and analysis workflows.

Role Overview

The Data Research Engineer will help shape how Forbes Advisor leverages AI and LLM technologies to streamline data operations, optimize research workflows, and build intelligent, scalable data systems.

This is a forward-looking role that combines elements of Data Engineering and AI/LLM Engineering. The ideal candidate is a creative problem-solver who proactively explores emerging technologies and identifies innovative ways to harness their potential for business impact.

Key Responsibilities
  • Develop and implement methods to leverage AI and LLMs (Large Language Models) for process automation and data research efficiency.

  • Proactively identify new AI/LLM-based solutions to streamline operations and improve data workflows.

  • Act as a visionary for AI/LLM adoption, anticipating future technological developments and preparing the team to capitalize on them early.

  • Assist in acquiring and integrating data from multiple sources, including web crawling, APIs, and other data pipelines.

  • Design and optimize ETL workflows to ensure high-quality data availability for downstream analysis.

  • Explore and evaluate third-party tools for modernizing legacy data systems and enhancing scalability.

  • Collaborate cross-functionally with content, research, and analytics teams to understand and fulfill data requirements.

  • Ensure timely delivery of project milestones in a fast-paced, dynamic environment.

  • Support and collaborate with fellow engineers on the Data Research and Extraction Team.

  • Use online technical resources effectively (e.g., Stack Overflow, ChatGPT, Bard) while understanding their limitations.

Required Skills & Experience
  • Bachelor’s degree in Computer Science, Data Science, Engineering, or a related field (advanced degree is a plus).

  • Minimum of 4 years of experience in Data Engineering, AI/ML Engineering, or related fields.

  • Strong proficiency in Python for data manipulation, automation, and API integration.

  • Experience in AI/ML engineering and data extraction workflows.

  • Proficiency in implementing Retrieval-Augmented Generation (RAG) pipelines using tools such as ChromaDB or Pinecone.

  • Experience with agentic AI platforms (e.g., CrewAI, LangChain) for modular and autonomous task execution.

  • Hands-on experience working with LLMs, including prompt engineering and mitigation of model “hallucinations.”

  • Familiarity with machine learning frameworks such as TensorFlow or PyTorch.

  • Exposure to NLP frameworks (spaCy, NLTK, Hugging Face, etc.).

  • Understanding of SQL and data querying (a plus).

  • Familiarity with web crawling techniques and API integration (a plus).

  • Experience using version control tools such as Git for collaborative development.

  • Strong problem-solving, analytical, and critical-thinking skills.

  • Excellent communication and teamwork abilities.

  • Ability to thrive in a high-growth environment with shifting priorities.

  • Experience managing or mentoring AI/LLM-focused teams is a plus.

  • Familiarity with Agile development methodologies (a plus).

Perks & Benefits
  • Day off on the 3rd Friday of every month (one long weekend each month)

  • Monthly Wellness Reimbursement Program to promote health and well-being

  • Monthly Office Commutation Reimbursement Program

  • Paid Paternity and Maternity Leave

Summary Classification

This position blends Data Engineering and AI/LLM Engineering expertise, focusing on data acquisition, pipeline automation, and AI-driven workflows. It is ideal for candidates with 4+ years of experience in building scalable data systems, integrating AI/LLM technologies, and driving innovation in data operations.


The Company
100 Employees

What We Do

NextHire Consulting is an AI-driven recruiting platform that streamlines the hiring process for companies. By leveraging AI agents for sourcing, screening, and interviewing, the platform enables teams to focus on pre-qualified finalists. It provides data-driven insights into candidate soft skills and behavioral styles, aiming to disrupt traditional recruitment models with efficient, automated, and science-based talent acquisition solutions for businesses of all sizes.
