AI Data Engineer

Posted 8 Days Ago
Bengaluru, Bengaluru Urban, Karnataka, IND
In-Office
Mid level
Artificial Intelligence • Cloud • Information Technology • Consulting
This role has been designed as 'Onsite' with an expectation that you will primarily work from an HPE office.

Who We Are:

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career our culture will embrace you. Open up opportunities with HPE.

Job Description:


HPE Financial Services is where we help organizations create the investment they need for digital transformation in an innovative and sustainable way. We partner with customers across their entire IT asset portfolio, from edge to cloud to end user. Unique to each client's aspirations and size, our financial and asset management solutions are anchored by best-in-class tech upcycling services. Join us in redefining what's next for you.

Role summary

We are looking for a technically sharp, detail-oriented Data Engineer to join the HPEFS (Hewlett Packard Enterprise Financial Services) Advanced Analytics & BI team in Bangalore. This role is the data backbone that powers our AI capabilities, working in close partnership with the AI Engineers to ensure that the data flowing into AI models, dashboards, and business workflows is clean, governed, and well-structured. You will be hands-on and own the backend data lifecycle: ingesting raw data from diverse sources, transforming it into reliable, analysis-ready datasets, enforcing data quality standards, and publishing governed data products via Microsoft Fabric and Databricks. You will also support reporting needs through Power BI and contribute to Collibra-based data governance initiatives. Working familiarity with Microsoft Copilot and AI-assisted data tooling is expected.

What you'll do:

Data Engineering & Transformation

  • Design, build, and maintain scalable ETL/ELT pipelines using Azure Data Factory, Databricks (PySpark / Delta Live Tables), and Microsoft Fabric Data Factory.

  • Transform raw, multi-source data into clean, conformed, and analytics-ready datasets following Medallion Architecture principles (Bronze → Silver → Gold).

  • Develop and optimize SQL and PySpark-based transformation logic for structured, semi-structured, and unstructured data.

  • Implement incremental load patterns, merge/upsert logic, and slowly changing dimension (SCD) strategies to support historical data tracking.

  • Collaborate with the AI Engineers to prepare high-quality feature datasets for ML and LLM use cases.
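
To make the merge/upsert and SCD Type 2 ideas above concrete, here is a minimal pure-Python sketch. In production this would typically be a Databricks `MERGE INTO` on a Delta table; the field names (`customer_id`, `segment`, the validity columns) are hypothetical:

```python
from datetime import date

def scd2_upsert(dim_rows, incoming, key="customer_id", tracked=("segment",)):
    """Simplified SCD Type 2 upsert: expire the current version of a row when
    a tracked attribute changes, then append the new version.

    Note: expires old versions by mutating them in place.
    """
    today = date.today().isoformat()
    current = {r[key]: r for r in dim_rows if r["is_current"]}
    result = list(dim_rows)
    for row in incoming:
        match = current.get(row[key])
        if match is None:
            # New business key: insert as the first current version.
            result.append({**row, "valid_from": today, "valid_to": None, "is_current": True})
        elif any(match[c] != row[c] for c in tracked):
            # Tracked attribute changed: close the old version, open a new one.
            match["valid_to"] = today
            match["is_current"] = False
            result.append({**row, "valid_from": today, "valid_to": None, "is_current": True})
        # Unchanged rows are left untouched, so re-running the load is idempotent.
    return result
```

The same close-and-append pattern underlies Delta Lake `MERGE` statements with matched-update and not-matched-insert clauses.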

Data Quality & Governance

  • Define, implement, and monitor data quality rules including completeness, accuracy, consistency, timeliness, and uniqueness checks.

  • Administer and extend the Collibra data governance platform — including business glossary management, data lineage documentation, and stewardship workflows.

  • Build automated data quality validation frameworks using tools such as Great Expectations, dbt tests, or Unity Catalog data quality constraints in Databricks.

  • Triage and resolve data quality incidents, root-cause data anomalies, and communicate impact to stakeholders proactively.

  • Maintain metadata catalogues and ensure all critical datasets have documented ownership, lineage, and classification.
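
The completeness and uniqueness checks described above can be sketched in plain Python. Frameworks like Great Expectations or dbt tests express the same rules declaratively; the column names here are hypothetical:

```python
def run_quality_checks(rows, required=("order_id", "amount"), unique_key="order_id"):
    """Evaluate simple completeness and uniqueness rules over a batch of
    records, returning one pass/fail result per rule."""
    results = {}
    # Completeness: every required column must be present and non-null.
    for col in required:
        missing = sum(1 for r in rows if r.get(col) is None)
        results[f"completeness_{col}"] = {"failed_rows": missing, "passed": missing == 0}
    # Uniqueness: the business key must not repeat within the batch.
    seen, dupes = set(), 0
    for r in rows:
        k = r.get(unique_key)
        dupes += k in seen
        seen.add(k)
    results[f"uniqueness_{unique_key}"] = {"failed_rows": dupes, "passed": dupes == 0}
    return results
```

A validation framework would run these rule results on every load and gate promotion to the next layer on all rules passing.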

Microsoft Fabric & Lakehouse

  • Build and manage Lakehouses, Warehouses, and Dataflows Gen2 within the Microsoft Fabric ecosystem.

  • Configure OneLake, shortcuts, and mirroring to unify data across sources without unnecessary duplication.

  • Leverage Fabric Notebooks (PySpark / Python) and Spark job definitions for large-scale data processing.

  • Support the semantic model layer in Fabric to ensure Power BI datasets are optimized and governed.

Power BI & Reporting

  • Develop and maintain Power BI semantic models (star schema design, DAX measures, row-level security).

  • Build production-grade dashboards and reports for business stakeholders; ensure refresh reliability and performance.

  • Apply Copilot-assisted authoring in Power BI and Fabric where applicable to accelerate report generation.

  • Support self-service analytics adoption by publishing governed datasets to the Power BI service.

Collaboration & AI Enablement

  • Partner closely with the AI Engineers, peer data scientists, and analytics team members to supply clean, structured data for RAG pipelines, model training, and agentic workflows.

  • Contribute to the design of shared data contracts and API schemas between data engineering and AI engineering layers.

  • Assist with AI-assisted data tasks using Microsoft Copilot (in Fabric, Power BI, and Azure environments).
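
A shared data contract between the data and AI engineering layers can be as simple as a typed field specification that both sides validate against. A minimal sketch, with hypothetical contract fields (Pydantic or JSON Schema are common production choices):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FieldSpec:
    name: str
    dtype: type
    nullable: bool = False

# Hypothetical contract for a feature dataset handed to the AI layer.
FEATURE_CONTRACT = [
    FieldSpec("account_id", int),
    FieldSpec("risk_score", float),
    FieldSpec("segment", str, nullable=True),
]

def validate_record(record, contract=FEATURE_CONTRACT):
    """Return a list of contract violations for one record (empty = valid)."""
    errors = []
    for spec in contract:
        value = record.get(spec.name)
        if value is None:
            if not spec.nullable:
                errors.append(f"{spec.name}: null not allowed")
        elif not isinstance(value, spec.dtype):
            errors.append(f"{spec.name}: expected {spec.dtype.__name__}")
    return errors
```

Because both teams import the same contract, schema drift surfaces as a validation failure at the boundary instead of a silent model-quality regression.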

What you need to bring:

Qualifications

  • Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, Mathematics, or a related discipline.

  • 4 – 5 years of hands-on experience in data engineering, ETL development, or analytics engineering roles.

  • Demonstrable experience with Databricks and/or Microsoft Fabric in a production environment.

  • Proficiency in Power BI report and semantic model development.

  • Exposure to Collibra or equivalent data governance / cataloguing platforms is strongly preferred.

  • Strong SQL and Python skills; PySpark experience is required.

  • Familiarity with Azure cloud services and DevOps practices for data pipeline deployment.

Technical Skill Requirements

  • Data Platforms - Databricks (PySpark, Delta Lake, Delta Live Tables, Unity Catalog), Microsoft Fabric (Lakehouse, Warehouse, Dataflows Gen2, Notebooks), Azure Data Lake Storage Gen2

  • Data Transformation - PySpark, SQL, dbt (data build tool), Azure Data Factory, Fabric Data Factory; Medallion Architecture, SCD types, incremental load patterns

  • Data Modelling - Star schema, snowflake schema, dimensional modelling, data vault concepts; normalization, entity-relationship design, semantic layer design

  • Reporting & BI - Power BI (DAX, semantic models, RLS, Power Query / M), Microsoft Fabric Power BI integration, Copilot-assisted authoring in Power BI

  • Programming - Python (primary), SQL (advanced); PySpark; familiarity with JSON, Parquet, Delta file formats

  • Cloud & DevOps - Azure (preferred): Synapse, ADF, ADLS Gen2, Key Vault; Git/GitHub for version control; CI/CD basics for pipeline deployment

  • Data Governance & Cataloguing - data lineage documentation, metadata management, data classification and tagging, business glossary ownership

  • AI & Copilot Tooling - Microsoft Copilot in Fabric / Power BI; familiarity with AI-assisted data transformation; understanding of LLM data requirements (embeddings, chunking, vector-ready formats)

  • Data Concepts - Data warehousing, lakehouse architecture, OLAP vs OLTP, event-driven ingestion, streaming basics (Structured Streaming / Event Hubs), data contracts, master data management (MDM)
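
On the LLM data-requirements point above: "chunking" means splitting documents into overlapping windows before computing embeddings, so that vector search retrieves passages rather than whole files. A minimal character-based sketch (real pipelines usually chunk by tokens or sentence boundaries instead):

```python
def chunk_text(text, max_chars=200, overlap=40):
    """Split a document into overlapping character windows, a common
    preprocessing step before embedding text for vector search."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap  # step forward, keeping `overlap` chars of shared context
    return chunks
```

Each chunk would then be embedded and stored alongside its source metadata in a vector-ready format.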

#Financialservices

Additional Skills:

Accountability, Action Planning, Active Learning, Active Listening, Agile Methodology, Agile Scrum Development, Analytical Thinking, Bias, Coaching, Creativity, Critical Thinking, Cross-Functional Teamwork, Data Analysis Management, Data Collection Management, Data Controls, Design, Design Thinking, Empathy, Follow-Through, Group Problem Solving, Growth Mindset, Intellectual Curiosity, Long Term Planning, Managing Ambiguity {+ 5 more}

What We Can Offer You:

Health & Wellbeing

We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.

Personal & Professional Development

We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have — whether you want to become a knowledge expert in your field or apply your skills to another division.

Unconditional Inclusion

We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.

Let's Stay Connected:

Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.

#india

Job:

Engineering

Job Level:

TCP_03


HPE is an Equal Employment Opportunity/ Veterans/Disabled/LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together. Please click here: Equal Employment Opportunity.

Hewlett Packard Enterprise is EEO Protected Veteran/ Individual with Disabilities.


HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring employers to consider for employment qualified applicants with criminal histories.


No Fees Notice & Recruitment Fraud Disclaimer


It has come to HPE's attention that there has been an increase in recruitment fraud whereby scammers impersonate HPE or HPE-authorized recruiting agencies and offer fake employment opportunities to candidates. These scammers often seek to obtain personal information or money from candidates.


Please note that Hewlett Packard Enterprise (HPE), its direct and indirect subsidiaries and affiliated companies, and its authorized recruitment agencies/vendors will never charge any candidate a registration fee, hiring fee, or any other fee in connection with its recruitment and hiring process.  The credentials of any hiring agency that claims to be working with HPE for recruitment of talent should be verified by candidates and candidates shall be solely responsible to conduct such verification. Any candidate/individual who relies on the erroneous representations made by fraudulent employment agencies does so at their own risk, and HPE disclaims liability for any damages or claims that may result from any such communication.


