Principal Data & AI Engineer

Posted 2 Days Ago
2 Locations
In-Office
€70K-€138K Annually
Expert/Leader
Healthtech • Biotech • Pharmaceutical • Manufacturing
The Role
Lead design and implementation of enterprise AI-ready data platforms and pipelines (batch/stream) enabling Generative AI, RAG, vector search, and LLM-powered applications. Define architecture, engineering standards, governance, observability, and MLOps/LLMOps practices. Partner with product and engineering teams to productionize data services, context-assembly pipelines, and agentic AI workflows while balancing cost, performance, security, and compliance.
Summary Generated by Built In

At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity. Learn more at jnj.com.

As guided by Our Credo, Johnson & Johnson is responsible to our employees who work with us throughout the world. We provide an inclusive work environment where each person is considered as an individual. At Johnson & Johnson, we respect the diversity and dignity of our employees and recognize their merit.

Job Function:

Data Analytics & Computational Sciences

Job Sub Function:

Data Engineering

Job Category:

Scientific/Technology

All Job Posting Locations:

Beerse, Antwerp, Belgium; Limerick, Ireland

Job Description:

We are currently recruiting for a Principal Data Engineer – AI & Generative AI Platforms based in Limerick, Ireland or Beerse, Belgium.

 

Role Overview

The Principal Data Engineer – AI & Generative AI Platforms provides enterprise technical leadership for the design and evolution of scalable, governed data platform capabilities that power advanced analytics, machine learning, and next-generation AI solutions (including Generative AI and Agentic AI). This role sets technical direction, establishes reference architectures and engineering standards, and drives measurable outcomes such as improved time-to-delivery, reduced platform/unit costs, and increased reuse and adoption across teams.

This role operates at the intersection of data engineering, AI platform engineering, and software engineering, driving cross-team alignment on how trusted, governed, and high-performance data ecosystems are designed and operated to support enterprise-scale AI workloads.

The successful candidate will play a key role in enabling AI-ready data foundations, supporting capabilities such as Retrieval-Augmented Generation (RAG), vector search, knowledge graphs, semantic layers, and LLM-powered applications. They will provide clear guidance on the benefits and trade-offs of these architectural components—and when to apply them—balancing cost, risk, performance, and governance requirements.

 

Key Responsibilities

AI & GenAI Data Platform Engineering

  • Architect and build core data platform capabilities that enable enterprise AI, machine learning, and Generative AI workloads, delivering scalable, governed solutions with strong performance and cost efficiency.
  • Build and operate scalable pipelines for structured and unstructured data ingestion for AI training and inference workloads (batch/stream), implementing data quality checks, lineage capture, and clear SLAs.
  • Design, implement, and harden reusable data services and APIs that support large language models (LLMs), knowledge retrieval systems, and AI-powered applications, meeting reliability and latency targets.
  • Implement LLM abstraction and model-routing components to maintain underlying model flexibility (“right model for the job”), including evaluation gates, fallback strategies, and operational controls. 
  • Build integrations that connect enterprise knowledge across data lakes, document stores, APIs, and enterprise systems, enabling secure retrieval and reuse in AI applications while reducing duplicated point solutions.

Generative AI & Agentic AI Enablement

  • Design, build, and productionize Retrieval-Augmented Generation (RAG) pipelines, enabling GenAI models to access trusted enterprise data with strong latency, reliability, and evaluation coverage. 
  • Build and run pipelines for embedding generation, vector indexing, and semantic search capabilities, including chunking strategies, refresh schedules, and cost-aware scaling.
  • Partner hands-on with product and engineering teams to ship AI copilots, conversational AI solutions, and agent-based AI workflows, integrating approved data access patterns and tooling.
  • Engineer infrastructure that supports multi-agent orchestration and tool-enabled AI systems. Design and implement robust context assembly pipelines (retrieval, ranking, summarization, caching) to balance quality, latency, and cost.
  • Design and implement streamlined context-collection strategies with telemetry and evaluation loops that maximize performance and accuracy and minimize expense.
  • Engineer systems that are resilient to common issues such as context pressure and context rot, using refresh/invalidation, drift detection, and re-embedding strategies.

 

Data Pipeline & Platform Development

  • Build and operate scalable data ingestion, transformation, and serving pipelines supporting analytics and AI workloads with data quality embedded.
  • Develop robust data models and data products enabling self-service analytics and AI development.
  • Implement real-time and batch pipelines supporting operational intelligence and AI-driven applications.
  • Apply modern engineering practices including CI/CD, automated testing, and infrastructure-as-code.
  • Apply self-verifying agentic feedback loops that bring non-deterministic outputs as close to compliance as possible before human-in-the-loop review.

 

LLMOps, MLOps & AI Observability

  • Build and operationalize AI systems using LLMOps and MLOps frameworks, including deployment automation, versioning, and repeatable release processes.
  • Implement end-to-end observability for AI systems including data lineage, prompt and model evaluation, monitoring, and performance tracking.

 

Data Governance, Trust & Responsible AI

  • Implement enterprise data governance practices including metadata management, lineage, and data cataloging.
  • Ensure data platforms support security, privacy, and regulatory requirements.
  • Contribute to responsible AI practices by ensuring traceability, transparency, and auditability of data used in AI systems.

 

Collaboration & Delivery

  • Lead hands-on delivery with Data Scientists, AI Engineers, Software Engineers, and Product Teams to ship reliable data products and AI-enabled capabilities end-to-end.
  • Translate priority business use cases into well-scoped technical designs, drive engineering alignment through design/architecture reviews, and mentor engineers on implementation patterns and operational rigor.

 

Required Qualifications

Education

Bachelor’s or Master’s degree in:

  • Computer Science
  • Data Engineering
  • Software Engineering
  • Artificial Intelligence
  • or a related technical discipline

 

Experience

  • Strong experience in data engineering and modern cloud data platforms.
  • Expertise in Python, SQL, or Scala.
  • Experience developing scalable data pipelines and data platforms.
  • Experience supporting machine learning or AI development workflows.
  • Strong understanding of data modelling, distributed data processing, and data architecture.

 

AI / GenAI Experience

Experience with one or more of the following:

  • Retrieval-Augmented Generation (RAG) pipelines
  • Vector databases and semantic search
  • Embedding generation workflows
  • LLM integration patterns
  • AI application architectures
  • AI platform engineering

 

Preferred Technical Skills

Data Platforms

  • Snowflake
  • Databricks
  • Microsoft Fabric
  • BigQuery

Data Engineering

  • Apache Spark
  • dbt
  • Airflow / workflow orchestration
  • Delta / Iceberg / Parquet data formats

AI & GenAI Technologies

  • Vector databases
  • LLM orchestration frameworks
  • AI application frameworks
  • LLM evaluation and observability tools

Cloud Platforms

  • AWS
  • Microsoft Azure
  • Google Cloud

 

Key Competencies

  • Strong engineering mindset and problem-solving ability
  • Ability to design scalable, resilient data systems
  • Curiosity and passion for AI innovation
  • Strong collaboration and communication skills

 

Impact of the Role

This role enables the enterprise to transform data into AI-ready assets, empowering advanced analytics, intelligent automation, and Generative AI solutions that accelerate innovation and deliver measurable business value.



Preferred Skills:

Advanced Analytics, Agility Jumps, Coaching, Critical Thinking, Data Engineering, Data Governance, Data Modeling, Data Privacy Standards, Data Science, Digital Fluency, Execution Focus, Hybrid Clouds, Organizing, Presentation Design, Technical Development, Technical Writing, Technologically Savvy

  


The anticipated pay range for this position, in the primary posting location, is:

€70.100,00 - €121.210,00

The anticipated pay ranges for additional locations are:


The anticipated base pay range for this position in BELGIUM is EUR 79.800 to EUR 137.770

Benefits:

In addition to base pay, we offer the following benefits*: an annual bonus with a set target (% of pay) depending on pay grade and location, where the actual amount is based on the employee’s and the company’s performance in the previous calendar year, or sales commissions. Moreover, we offer vacation days, parental leave for a minimum of 12 weeks, bereavement leave, caregiver leave, volunteer leave, well-being reimbursement, and programs for financial, physical, and mental health. We also offer service anniversary and recognition awards, and, subject to the terms of their respective plans, employees (and, in some locations, eligible dependents) can participate in several insurance plans. For more information, visit “Employee benefits | Supporting well-being & career growth” on Johnson & Johnson Careers.


*This is for informative purposes only. Amounts and actual benefits may vary by location and are subject to change.



Skills Required

  • Bachelor's or Master's degree in Computer Science, Data Engineering, Software Engineering, Artificial Intelligence, or related technical discipline
  • Strong experience in data engineering and modern cloud data platforms
  • Expertise in Python, SQL, or Scala
  • Experience developing scalable data pipelines and data platforms (batch and streaming)
  • Experience supporting machine learning or AI development workflows
  • Strong understanding of data modelling, distributed data processing, and data architecture
  • Experience with Retrieval-Augmented Generation (RAG) pipelines
  • Experience with vector databases and semantic search
  • Experience with embedding generation workflows
  • Experience with LLM integration patterns and AI application architectures
  • Experience with AI platform engineering, LLMOps, MLOps and AI observability (deployment automation, versioning, monitoring)
  • Familiarity with data governance, metadata management, lineage, and data cataloging to support security and compliance
  • Experience with Snowflake
  • Experience with Databricks
  • Experience with Microsoft Fabric
  • Experience with BigQuery
  • Experience with Apache Spark
  • Experience with dbt
  • Experience with Airflow or other workflow orchestration
  • Experience with Delta Lake, Apache Iceberg, or Parquet data formats
  • Experience with cloud platforms (AWS, Azure, Google Cloud)

Johnson & Johnson Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Johnson & Johnson and has not been reviewed or approved by Johnson & Johnson.

  • Healthcare Strength: Healthcare coverage is characterized as comprehensive across medical, dental, and vision, with added supports like onsite clinics, fitness centers, and Employee Assistance resources. Mental-health services and wellbeing reimbursements are also described as meaningful components of the overall package.
  • Retirement Support: Retirement offerings are portrayed as a major differentiator, combining a 401(k) with employer matching and an employer-funded pension plan. Stock options and other long-term financial supports are also positioned as part of the broader rewards mix.
  • Parental & Family Support: Family-related benefits are presented as notably strong, including paid parental leave for all new parents and additional leave types for caregiving and bereavement. Financial assistance for adoption, fertility treatment, and surrogacy is highlighted as a significant support.

The Company
HQ: New Brunswick, NJ
143,612 Employees
Year Founded: 1886

What We Do

Profound Change Requires Boldness. Johnson & Johnson is the largest and most broadly based healthcare company in the world. We’re producing life-changing breakthroughs every day, and have been for the last 130 years. The combination of new technologies and your expertise enables amazing things to happen. Teams from J&J’s consumer business are creating digital tools to help people track the health of their skin. Those working in medical devices are 3-D printing artificial joints personalized for each patient, while researchers in pharmaceuticals use AI to discover lifesaving drugs. Imagine what the rest of our team of 134,000 people at 260 companies in more than 60 countries across the world is accomplishing. We redefine what it means to be a big company in today’s world. Social Media Community Guidelines: http://www.jnj.com/social-media-community-guidelines
