Senior AI Data Engineer

Posted 7 Hours Ago
15 Locations
Remote
Senior level
Agency
The Role
Design, build, and scale ETL/ELT and real-time data pipelines for AI workloads (RAG, fine-tuning, batch inference). Transform unstructured data into vectorized formats, manage feature stores and vector databases, enforce data quality/governance, integrate event systems (Kafka), and collaborate with ML and engineering teams.

At TechBiz Global, we provide recruitment services to the top clients in our portfolio.

We are currently looking for a dedicated Senior AI Data Engineer to join one of our clients' teams. If you're looking for an exciting opportunity to grow in an innovative environment, this could be the perfect fit for you.

Responsibilities:

▪ Design, build, and scale robust ETL/ELT pipelines optimized for AI workloads, including RAG, fine-tuning, and batch inference.

▪ Transform unstructured data sources such as PDFs, logs, and transcripts into structured and vectorized formats suitable for LLM consumption.

▪ Maintain and automate the data-to-model lifecycle, ensuring AI knowledge bases remain synchronized with changing business data.

▪ Develop and maintain real-time feature pipelines that support low-latency AI and machine learning applications.

▪ Integrate data platforms with Kafka and other event-driven systems to enable real-time processing and AI-driven responses.

▪ Manage and optimize Feature Stores to ensure consistency between model training and production environments.

▪ Implement automated data quality controls and validation processes to ensure the reliability and accuracy of AI training and inference data.

▪ Establish and maintain data lineage frameworks to provide traceability, auditability, and regulatory compliance across data workflows.

▪ Enforce data security, privacy, and governance standards, including PII protection and compliance with industry regulations.

▪ Manage data movement and synchronization across on-premises systems, cloud platforms, and data warehouses.

▪ Optimize data storage and retrieval strategies for Vector Databases to support high-performance RAG and AI search workloads.

▪ Collaborate with Data Scientists, ML Engineers, Software Engineers, and business stakeholders to deliver scalable AI data solutions.


Job requirements

▪ 10+ years of experience in Data Engineering or Backend Engineering with a strong focus on data platforms and pipelines.

▪ 2+ years of hands-on experience supporting AI/ML data pipelines, including data preparation for machine learning and generative AI applications.

▪ Expert-level proficiency in Python and SQL; experience with Java or Scala is an advantage.

▪ Strong experience building and maintaining real-time data streaming solutions using Apache Kafka, Flink, or Spark Streaming.

▪ Hands-on experience with modern data orchestration and transformation tools such as Airflow, dbt, and Prefect.

▪ Experience working with Vector Databases and Feature Stores to support AI and machine learning workloads.

▪ Strong knowledge of cloud-based data services on AWS, Azure, or GCP, including services such as Glue, Kinesis, Data Factory, or Dataflow.

▪ Experience deploying and managing data workloads in Kubernetes (K8s) environments.

▪ Proven experience handling sensitive data within regulated industries such as Fintech, Healthcare, or other compliance-driven environments.

▪ Strong understanding of data quality, governance, security, and privacy best practices.

▪ Bachelor's degree in Computer Science, Software Engineering, Information Systems, or a related technical field. Equivalent practical experience will also be considered.

▪ Excellent problem-solving skills and the ability to collaborate effectively with cross-functional engineering, data, and AI teams.


The Company
20 Employees
