Data Engineer (PySpark and Apache Airflow)

Pune, Mahārāshtra, IND
In-Office
Mid level
Business Intelligence
The Role
Design, build, and optimize scalable ETL/ELT data pipelines using PySpark and orchestrate workflows with Apache Airflow. Process large distributed datasets, implement data quality checks, manage ingestion from varied sources, troubleshoot failures, implement CI/CD, and ensure governance and security while collaborating with analytics and BI teams.
Summary Generated by Built In

About Aligned Automation

At Aligned Automation, we live by our "Better Together" philosophy to build a better world. As a strategic service provider to Fortune 500 companies, we help digitize enterprise operations and drive impactful business strategies. Our purpose goes beyond projects—we strive to deliver meaningful, sustainable change that shapes a more optimistic and equitable future.

Our culture is deeply rooted in our 4Cs—Care, Courage, Curiosity, and Collaboration—ensuring that each employee is empowered to grow, innovate, and thrive in an inclusive workplace.

Job Summary

We are seeking a skilled Data Engineer with strong expertise in PySpark and Apache Airflow to design, build, and optimize scalable data pipelines. The ideal candidate should have experience in big data processing, workflow orchestration, and cloud-based data platforms.


Key Responsibilities

  • Design, develop, and maintain scalable ETL/ELT pipelines using PySpark
  • Build and manage workflow orchestration using Apache Airflow
  • Process large datasets using distributed computing frameworks (Spark)
  • Optimize data pipelines for performance, reliability, and scalability
  • Implement data quality checks and monitoring mechanisms
  • Work closely with Data Analysts, Data Scientists, and BI teams
  • Manage data ingestion from various sources (APIs, databases, flat files, streaming)
  • Troubleshoot and resolve pipeline failures
  • Implement CI/CD for data pipelines
  • Ensure data governance and security best practices

Required Skills

Technical Skills:

  • Strong hands-on experience in PySpark
  • Experience in Apache Airflow (DAGs, Operators, Scheduling)
  • Good understanding of Spark architecture
  • Strong SQL knowledge
  • Experience with data warehousing concepts
  • Experience with:
    • Cloud storage: S3 / ADLS / GCS
    • Analytical databases: Redshift / Snowflake / BigQuery
  • Knowledge of Git and version control
  • Understanding of REST APIs and data ingestion

Good to Have:

  • Experience with cloud platforms (AWS / Azure / GCP)
  • Experience with Kafka or streaming pipelines
  • Docker & Kubernetes knowledge
  • Delta Lake / Iceberg knowledge
  • Experience with CI/CD tools (Jenkins, GitHub Actions)
  • Experience with monitoring tools (Prometheus, Grafana)

Educational Qualification

  • Bachelor’s or Master’s degree in Computer Science, IT, Engineering, or a related field

Soft Skills

  • Strong problem-solving skills
  • Good communication and collaboration skills
  • Ability to work in an agile environment
  • Ownership mindset and attention to detail


Skills Required

  • Hands-on experience in PySpark
  • Experience in Apache Airflow (DAGs, Operators, Scheduling)
  • Understanding of Spark architecture
  • Strong SQL knowledge
  • Experience with data warehousing concepts
  • Experience with cloud storage (S3, ADLS, GCS)
  • Experience with analytical databases (Redshift, Snowflake, BigQuery)
  • Knowledge of Git and version control
  • Understanding of REST APIs and data ingestion patterns
  • Experience ingesting data from APIs, databases, files, and streaming sources
  • Experience implementing data quality checks and monitoring mechanisms
  • Experience implementing CI/CD for data pipelines
  • Bachelor's or Master's degree in Computer Science, IT, Engineering, or a related field
  • Experience with cloud platforms (AWS, Azure, GCP)
  • Experience with Kafka or streaming pipelines
  • Docker and Kubernetes knowledge
  • Delta Lake or Iceberg knowledge
  • Experience with CI/CD tools (Jenkins, GitHub Actions)
  • Experience with monitoring tools (Prometheus, Grafana)

The Company
HQ: Irving, TX
344 Employees
Year Founded: 2018

What We Do

Technology, society, economy, policy – all are moving at breakneck speed in our 21st-century world. You’re feeling the pressure to quickly implement new business models, find new value, make split-second informed decisions and keep one step ahead of customers. How? The answer lies in the ability to make quick, accurate and sustainable business decisions. We believe digital offers a way of doing things better – but the journey to transformation doesn’t have to be painful. At Aligned Automation, we work hard to digitally enable your business strategy – connecting processes, technologies and people to unlock value and drive critical business outcomes.

