Experience: 3–6 Years
Location: Noida
Employment Type: Full-Time
Work Mode: Work from office (WFO), 5 days a week
We are looking for a skilled Data Engineer with 3–6 years of experience to design, build, and maintain scalable data pipelines and data processing systems. The ideal candidate should have strong experience in PySpark, SQL, AWS services, and workflow orchestration tools like Airflow, along with exposure to big data technologies such as Hadoop.
Key Responsibilities
Design, develop, and maintain scalable data pipelines for processing large datasets.
Build and optimize ETL/ELT workflows using PySpark and SQL.
Develop and manage data workflows using Apache Airflow for scheduling and orchestration.
Work with AWS data services to build robust and scalable data platforms.
Integrate and process data from multiple sources including structured and unstructured data.
Perform data transformation, cleansing, and aggregation to support analytics and reporting.
Optimize data processing jobs for performance, reliability, and scalability.
Collaborate with data scientists, analysts, and engineering teams to support data requirements.
Ensure data quality, governance, and security across pipelines.
Required Skills and Qualifications
Strong programming experience in PySpark and Python.
Strong knowledge of SQL and database concepts.
Hands-on experience with AWS services such as S3, Glue, EMR, Redshift, Lambda, or EC2.
Experience building data pipelines and ETL workflows.
Experience with Apache Airflow for workflow orchestration.
Knowledge of Hadoop ecosystem (HDFS, Hive, Spark).
Experience handling large-scale data processing and distributed systems.
Understanding of data modeling and data warehousing concepts.
Experience with Kafka or streaming data pipelines.
Experience with Docker or containerized environments.
Exposure to CI/CD pipelines and DevOps practices.
Experience with data lake architecture.
Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field.
Skills Required
- 3–6 years of experience in data engineering
- Strong programming experience in PySpark
- Strong programming experience in Python
- Strong knowledge of SQL and database concepts
- Hands-on experience with AWS data services (S3, Glue, EMR, Redshift, Lambda, EC2)
- Experience building data pipelines and ETL/ELT workflows
- Experience with Apache Airflow for workflow orchestration
- Knowledge of Hadoop ecosystem (HDFS, Hive, Spark)
- Experience handling large-scale data processing and distributed systems
- Understanding of data modeling and data warehousing concepts
- Bachelor's or Master's degree in Computer Science, IT, or related field
- Experience with Kafka or streaming data pipelines
- Experience with Docker or containerized environments
- Exposure to CI/CD pipelines and DevOps practices
- Experience with data lake architecture
What We Do
NextHire Consulting is an AI-driven recruiting platform that streamlines the hiring process for companies. By leveraging AI agents for sourcing, screening, and interviewing, the platform enables teams to focus on pre-qualified finalists. It provides data-driven insights into candidate soft skills and behavioral styles, aiming to disrupt traditional recruitment models with efficient, automated, and science-based talent acquisition solutions for businesses of all sizes.





