Principal Vector Data Engineer

2 Locations
In-Office
Mid level
Healthtech • Pharmaceutical • Manufacturing
The Role
The Principal Vector Data Engineer will design and implement vector embedding models for multimodal biomedical data processing, develop quality control protocols, and contribute to digital biomarker identification. The role involves collaboration with cross-functional teams to advance machine learning applications in neurodegenerative disorders.
Summary Generated by Built In

At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity. Learn more at https://www.jnj.com

Job Function:

Data Analytics & Computational Sciences

Job Sub Function:

Data Science

Job Category:

Scientific/Technology

All Job Posting Locations:

Cornellà de Llobregat, Barcelona, Spain; Madrid, Spain

Job Description:

Johnson & Johnson Innovative Medicine (J&J IM), a pharmaceutical company of Johnson & Johnson, is recruiting for a Principal Vector Data Engineer. The primary location for this position is Barcelona, Spain; the secondary location is Madrid, Spain. This is a hybrid role.

Our expertise in Innovative Medicine is informed and inspired by patients, whose insights fuel our science-based advancements. Visionaries like you work in teams that save lives by developing the medicines of tomorrow.  

Join us in developing treatments, finding cures, and pioneering the path from lab to life while championing patients every step of the way. Learn more at https://www.jnj.com/innovative-medicine 

 

Position Summary: 

The Principal Vector Data Engineer is a technical and strategic leader operating at the intersection of AI, digital health, and therapeutic R&D. This role leads the development of multimodal vector embedding pipelines and foundation model architectures supporting longitudinal data integration, disease progression modeling, and digital biomarker discovery across Neuroscience, Oncology, and Immunology. The successful candidate will guide enterprise-scale vectorization efforts while ensuring compliance with clinical, regulatory, and GxP data standards.
Key Responsibilities:
Technical Leadership
• Lead the design, development, and optimization of vector embedding models for diverse biomedical modalities including clinical, regulatory, imaging (MRI, PET), and digital health data.
• Architect scalable, compliant embedding pipelines using modern vector database technologies (FAISS, Pinecone, Weaviate, Milvus, Chroma, etc.).
• Establish robust quality-control frameworks for mobile-captured images and convert pixel-level data into high-fidelity vector representations.
• Drive the adaptation of state-of-the-art academic methods into production-ready, GxP-aware foundation models.
• Oversee multimodal data integration efforts to enable semantic search, retrieval-augmented analysis, and clinical insight generation.
Cross-Functional & Regulatory Leadership
• Collaborate with data scientists, clinicians, engineering teams, and regulatory/QA partners to ensure models and data pipelines align with GxP, clinical governance, and documentation standards.
• Contribute to digital biomarker discovery and predictive modeling for neurodegenerative, neuropsychiatric, oncologic, and immunologic conditions.
• Mentor junior engineers and contribute to technical roadmap planning, architectural reviews, and AI strategy development.
Qualifications:

• MS/PhD in Computer Science, Electrical Engineering, Biomedical Engineering, or related discipline.

• 3+ years of experience in multimodal ML, vector representation learning, biomedical signal processing, or large-scale embedding systems.

• Expertise in Python, PyTorch/TensorFlow, Hugging Face, and multimodal embedding architectures (CLIP, MedCLIP, BioBERT, TimeSformer, etc.).

• Hands-on experience with vector indexing/search systems (FAISS, Pinecone, Weaviate, Milvus, Qdrant, Chroma).

• Familiarity with sentence-transformers, LangChain, or LlamaIndex for semantic search and RAG workflows.

• Understanding of clinical trial data structures, longitudinal monitoring, GxP system requirements, and compliant data lifecycle management.

Strategic Impact:
• Enterprise biomedical data transformed into vectorized, interoperable assets powering scientific AI and semantic intelligence.
• Improved data governance, lineage, and GxP alignment across foundation models and vector pipelines.
• Accelerated discovery of digital biomarkers and predictive patterns across therapeutic areas.
• Scalable vector infrastructure enabling next-generation clinical and translational AI research.

 

#JRDDS  #JNJDataScience

 



Top Skills

FreeSurfer
FSL
Hugging Face Transformers
Librosa
NiBabel
Nilearn
Python
PyTorch
scikit-learn
SpeechBrain
TensorFlow
The Company
HQ: New Brunswick, NJ
143,612 Employees
Year Founded: 1886

What We Do

Profound Change Requires Boldness.

Johnson & Johnson is the largest and most broadly based healthcare company in the world. We’re producing life-changing breakthroughs every day, and have been for the last 130 years.

The combination of new technologies and your expertise enables amazing things to happen. Teams from J&J’s consumer business are creating digital tools to help people track the health of their skin. Those working in medical devices are 3-D printing artificial joints personalized for each patient, while researchers in pharmaceuticals use AI to discover lifesaving drugs. Imagine what the rest of our team of 134,000 people at 260 companies in more than 60 countries across the world is accomplishing. We redefine what it means to be a big company in today’s world.

Social Media Community Guidelines:
http://www.jnj.com/social-media-community-guidelines
