Manager, Data Engineer

Posted Yesterday
Chennai, Tamil Nadu
Hybrid
Senior level
Artificial Intelligence • Healthtech • Machine Learning • Natural Language Processing • Biotech • Pharmaceutical
We’re in relentless pursuit of breakthroughs that change patients’ lives.
The Role
Manage the engineering of data pipelines, integrate diverse data sources, ensure data quality, and collaborate with teams for machine learning applications.
Summary Generated by Built In
Why Patients Need You
All over the world, Pfizer colleagues work together to positively impact health for everyone, everywhere. Our colleagues have the opportunity to grow and develop a career that offers both individual and company success, to be part of an ownership culture that values diversity and where all colleagues are energized and engaged, and to impact the health and lives of millions of people. Digital is at the core of how Pfizer delivers Breakthroughs That Change Patients' Lives. Advanced technologies that accelerate research, development, manufacturing, and patient access to therapies are all made possible by the infrastructures that enable our Digital landscape.
What You Will Achieve
The Infrastructure & Automation organization delivers excellence in the pursuit of those breakthroughs through industry-leading service performance. We ensure optimal performance of the network and hosting services that power Pfizer's business processes. We strive to revolutionize service dependability by applying advanced analytics to drive predictive detection, identifying potential issues with our services, and intervening before they disrupt our business. We place data at the heart of what we do and apply a relentless focus on continuous improvement to enable Pfizer's business processes and patient outcomes.
ROLE SUMMARY
This role focuses on building, optimizing, and maintaining data pipelines, semantic data layers, and graph-based data models that support anomaly detection, incident prediction, event correlation, impact analysis, and automated knowledge generation. You will work extensively with digital IT service data, CMDB records, monitoring feeds, and infrastructure data from platforms like ServiceNow and cloud environments. Responsibilities include owning ingestion, processing, lineage, quality, and governance of complex IT operations data at scale while shaping the semantic and ontological foundations that make infrastructure knowledge discoverable and machine-interpretable. The position requires strong engineering discipline, hands-on development ability, and comfort working with distributed systems and cross-functional teams.
ROLE RESPONSIBILITIES
  • Integrate and link data across monitoring tools, compliance and audit platforms, CMDB, documentation and knowledge systems to create a unified, trustworthy data ecosystem.
  • Build and maintain scalable data pipelines that ingest, process and transform infrastructure data from ServiceNow, monitoring systems, CMDB sources and cloud platforms.
  • Engineer data models to support ML-driven use cases such as anomaly detection, intelligent incident classification, dependency mapping and forecasting.
  • Implement data quality checks, validation frameworks, and anomaly detection mechanisms to ensure trustworthy analytics input.
  • Partner with AI/ML engineers and SRE teams to identify feature requirements, define data contracts and prepare training and inference datasets.
  • Collaborate with Data Engineers, platform SMEs, and analytics leads to align on semantic definitions across configuration items, assets and event data, while developing and governing semantic data layers that enforce consistent taxonomies, conceptual mappings, and standardized definitions across all datasets.
  • Optimize ETL/ELT jobs for cost, performance, and reliability across Azure, AWS, and on-prem platforms.
  • Maintain data lineage, metadata, and documentation to support governance, auditability, and downstream adoption.
  • Develop curated data products for Power BI, predictive analytics and operational dashboards.
  • Lead data standardization efforts where possible and set up metrics that measure data quality.
  • Support platform integration initiatives by ensuring consistent data availability, versioning, and cross-system interoperability through semantic enrichment, entity resolution, and ontology-aligned metadata.
  • Implement ontology-driven models and use graph databases (e.g., Neo4j, Neptune, Cosmos DB Graph API) to power relationship-based analytics and improve data discoverability.
  • Define and maintain domain ontologies aligned with IT operations, infrastructure, ITSM and configuration management data structures.

BASIC QUALIFICATIONS
  • Bachelor's degree in data engineering, computer science, or a related field.
  • 5+ years of strong multi-discipline data experience, including data concepts and technologies.
  • Building and maintaining ETL/ELT pipelines for structured and unstructured data.
  • Proficiency in Python and SQL for developing scalable data workflows.
  • Expertise in Big Data tools (Spark, Hadoop, Databricks) and streaming frameworks (Kafka).
  • Implementing semantic metadata and linking data to business glossaries for AI-readiness.
  • Hands-on experience with data modeling, schema design, and event-driven architecture.
  • Skilled in SQL, NoSQL, and vector databases for AI workflows.
  • Managing data lakes, warehouses, and cloud storage solutions (Azure Data Factory, Synapse, AWS Glue, Redshift, S3).
  • Designing frameworks for data validation, anomaly detection, and lineage tracking.
  • Implementing observability for AI pipelines and operational datasets (incidents, changes, monitoring metrics).
  • Strong understanding of APIs, microservices integration, and workflow automation.
  • Ability to work with ML engineers to operationalize training data pipelines and inference-ready datasets.
  • Strong communication skills to translate data engineering outputs into actionable insights.

PREFERRED QUALIFICATIONS
  • Experience integrating with ServiceNow CMDB, ITSM, and operational data models.
  • Experience with Dataiku or similar platforms.
  • Exposure to AIOps ecosystems (Dynatrace, Splunk, NetIM, Datadog).
  • Knowledge of graph databases and relationship modeling for dependency mapping.
  • Experience working with semantic technologies or enterprise ontologies.
  • Understanding of ITIL, infrastructure performance metrics, and operational processes.

NON-STANDARD WORK SCHEDULE, TRAVEL OR ENVIRONMENT REQUIREMENTS
Standard work schedule
Work Location Assignment: Hybrid
Pfizer is an equal opportunity employer and complies with all applicable equal employment opportunity legislation in each jurisdiction in which it operates.
Marketing and Market Research

Top Skills

AWS Glue
Azure Data Factory
Azure Synapse
Databricks
Hadoop
Kafka
Python
Redshift
S3
Spark
SQL

The Company
HQ: New York, NY
121,990 Employees
Year Founded: 1848

What We Do

Our purpose ensures that patients remain at the center of all we do. We live our purpose by sourcing the best science in the world; partnering with others in the healthcare system to improve access to our medicines; using digital technologies to enhance our drug discovery and development, as well as patient outcomes; and leading the conversation to advocate for pro-innovation/pro-patient policies.

Why Work With Us

We are the inventors, the problem solvers, the big thinkers — those who surmount any hurdle to deliver breakthrough medicines to the people who are counting on them the most.

Pfizer Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Not Specified
HQ: Hudson Yards
Provincia de Buenos Aires
Andover, MA
Athens, GR
Chennai, IN
Collegeville, PA
Cork, IE
Dublin, IE
Durham, NC
Groton, CT
Kildare, IE
Madison, NJ
Madrid, ES
Mumbai, Maharashtra
Rochester, MI
San Diego, CA
Seattle, WA
Heights Union East
Center for Digital Innovation
