The Role
Design, develop, and manage data pipelines and ETL workflows using Databricks, PySpark, and SQL. Build frameworks for data ingestion and optimize data architecture for analytics. Collaborate with stakeholders and support data governance and integration activities.
Summary Generated by Built In
Location: Indore/Raipur
Experience: 2 to 4 years
Key Responsibilities:
- Design, develop, and manage scalable data pipelines and ETL workflows using Databricks, PySpark, and SQL for large-scale data processing.
- Build and maintain data ingestion frameworks to extract data from enterprise systems such as SAP APIs, REST services, and relational databases.
- Develop and optimize Delta Lake-based data architecture to ensure reliable, high-performance data storage and processing.
- Design and implement data transformation pipelines to convert raw data into curated datasets for analytics and reporting.
- Optimize Spark jobs and SQL queries to improve performance and reduce compute costs.
- Implement data quality validation, monitoring, and error handling frameworks for reliable pipeline execution.
- Build automated workflow orchestration and scheduling mechanisms for end-to-end data processing pipelines.
- Collaborate with data analysts, business stakeholders, and platform teams to design efficient data solutions.
- Develop and maintain data models and schema design for data lake and downstream analytical systems.
- Support data platform engineering activities, including cluster configuration, performance tuning, and reusable utility development.
- Troubleshoot production pipeline failures, data inconsistencies, and performance issues.
- Develop Python utilities and frameworks to support data ingestion, transformation, and automation tasks.
- Implement data governance, security, and access control standards across enterprise data pipelines.
- Participate in code reviews, contribute to documentation, and promote best practices to raise overall data engineering standards.
- Support large-scale data integrations and migrations from legacy systems to modern cloud data platforms.
- Own the entire data pipeline lifecycle, from development to deployment.
Requirements:
- 2+ years of experience in Data Engineering, Data Pipeline Development, and Data Processing.
- Strong experience with Python, PySpark, and SQL for large-scale data transformations.
- Hands-on experience with Databricks, Delta Lake, and distributed data processing frameworks.
- Experience integrating data from REST APIs, SAP systems, and enterprise data sources.
- Strong knowledge of data modeling, schema design, and ETL best practices.
- Experience working with cloud data platforms (GCP / AWS / Azure) and cloud storage systems.
- Experience with workflow orchestration, job scheduling, and automated data pipelines.
- Ability to optimize Spark workloads and troubleshoot performance issues on large datasets.
- Strong problem-solving skills and ability to work in fast-paced data platform environments.
Skills Required
- 2+ years of experience in Data Engineering, Data Pipeline Development, and Data Processing
- Strong experience with Python, PySpark, and SQL
- Hands-on experience with Databricks and Delta Lake
- Experience integrating data from REST APIs and SAP systems
- Strong knowledge of data modeling and ETL best practices
- Experience working with cloud data platforms (GCP/AWS/Azure)
- Experience with workflow orchestration and automated data pipelines
- Ability to optimize Spark workloads and troubleshoot performance issues
The Company
What We Do
We empower and transform our customers’ businesses through digital technologies. NucleusTeq is a software services, solutions, and products company whose core focus areas are Big Data, Cloud, Analytics (AI, ML), Blockchain, Enterprise Automation, Mobility, CRM, and ERP. We help several Fortune 1000 clients in the USA, Canada, the UK, and India navigate digital transformation.








