PySpark Developer | Big Data, Cloud, SQL, Data Pipelines, DataOps

Tharamani, Chennai, Tamil Nadu, IND
In-Office
Senior level
Fintech • Financial Services
The Role
We are seeking a Senior PySpark Developer / Data Engineer with extensive experience in data pipeline development, machine learning, and programming in Python and Spark. The candidate will lead data-driven projects, ensuring robust and clean code delivery while working collaboratively with teams and stakeholders.

Job Summary

Synechron is seeking a highly experienced PySpark Developer to lead the development and optimization of large-scale data processing workflows. This role involves designing, building, and maintaining robust data pipelines using PySpark, Hadoop, and related big data technologies to support enterprise analytics, machine learning, and Data Science initiatives. The ideal candidate will drive data engineering best practices, ensure data quality and performance, and collaborate with cross-functional teams to deliver scalable and reliable data solutions aligned with organizational goals.

Software Requirements

Required:

  • Proficiency in Python, PySpark, and Spark (version 2.4 or later) for building scalable data pipelines.

  • Experience with Hadoop ecosystem components such as Hive, MapReduce, or HDFS.

  • Familiarity with SQL and NoSQL databases (e.g., PostgreSQL, MongoDB).

  • Version control skills with Git.

  • Development, orchestration, and scheduling tools such as Apache Airflow, Jenkins, or GitHub Actions.

  • Experience with containerization tools like Docker.

  • Data management and transformation using tools such as Pandas and Dask.

Preferred:

  • Knowledge of cloud data platforms such as AWS or Azure.

  • Familiarity with DataOps practices, automation tools, and monitoring dashboards such as Prometheus or Grafana.

Overall Responsibilities

  • Design, develop, and optimize large-scale data processing pipelines using PySpark and Hadoop technologies.

  • Build efficient, reliable, and scalable data workflows following best practices for performance and data quality.

  • Implement data transformations, feature engineering, and validation techniques for data science applications.

  • Collaborate with data scientists, analysts, and product teams to gather requirements and deliver impactful data solutions.

  • Conduct performance tuning and troubleshooting to resolve data pipeline issues and inefficiencies.

  • Automate deployment, testing, and operational activities incorporating CI/CD pipelines.

  • Maintain detailed documentation of data architectures, workflows, and operational procedures.

  • Support migration projects and cloud integrations to enhance data scalability and security.

Expected outcomes include high-performance data pipelines capable of handling large data volumes and high velocity, with minimal downtime and high data integrity.
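For illustration, the transformation-and-validation work described in the responsibilities above can be sketched in plain Python (standard library only). This is a hypothetical, simplified stand-in for logic that would normally run as PySpark DataFrame operations; the field names, the high-value threshold, and the null-rate limit are all invented for the example:

```python
# Hypothetical sketch of a transform-and-validate pipeline step.
# In a real job this would be expressed as PySpark DataFrame operations;
# the record fields and thresholds below are illustrative assumptions.

def transform(records):
    """Derive a simple feature for each record (feature engineering)."""
    out = []
    for r in records:
        r = dict(r)  # avoid mutating the caller's data
        # Flag high-value rows; the 1000 threshold is made up for the example.
        r["high_value"] = (r.get("amount") or 0) > 1000
        out.append(r)
    return out

def validate(records, column, max_null_rate=0.05):
    """Fail fast if too many values in `column` are missing (data quality gate)."""
    nulls = sum(1 for r in records if r.get(column) is None)
    rate = nulls / len(records) if records else 1.0
    if rate > max_null_rate:
        raise ValueError(f"{column}: null rate {rate:.0%} exceeds {max_null_rate:.0%}")
    return rate

rows = transform([{"amount": 1500}, {"amount": 200}, {"amount": None}])
validate(rows, "amount", max_null_rate=0.5)
```

In an actual PySpark job the same checks would be expressed with DataFrame filters and aggregations so they scale across the cluster.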

Technical Skills (By Category)

Programming Languages:

  • Essential: Python, PySpark (version 2.4 or later)

  • Preferred: Java, Scala (for big data processing and integration)

Databases/Data Management:

  • Essential: SQL (PostgreSQL, MySQL), NoSQL (MongoDB, similar)

  • Preferred: Data warehousing solutions (Snowflake, Redshift)

Cloud Technologies:

  • Preferred: AWS (S3, EMR, Glue), Azure Data Factory, cloud-based data processing

Frameworks & Libraries:

  • Essential: PySpark, Hadoop ecosystem (Hive, MapReduce), Pandas

  • Preferred: Dask, TensorFlow or other ML integration tools

Development & Automation Tools:

  • Essential: Git, Jenkins, CI/CD pipelines, Docker

  • Preferred: Kubernetes, Terraform, CloudFormation, DataOps tools like Airflow

Security & Compliance:

  • Awareness of data encryption, role-based access, GDPR, HIPAA compliance.

Experience Requirements

  • 7+ years of professional experience building large-scale data pipelines in production environments.

  • Hands-on expertise with PySpark, Hadoop, Spark, and data workflow orchestration tools.

  • Proven experience with data ingestion, transformation, and validation processes.

  • Experience supporting enterprise-scale data initiatives in finance, healthcare, retail, or similar sectors preferred; relevant experience in other industries also acceptable.

  • Strong troubleshooting, performance tuning, and optimization skills.

Day-to-Day Activities

  • Develop, test, and maintain scalable data pipelines using PySpark and related big data tools.

  • Optimize existing workflows and implement new features to improve data throughput and reliability.

  • Collaborate with data scientists, analysts, and engineers to understand data requirements and deliver effective solutions.

  • Monitor system performance with dashboards, troubleshoot bottlenecks, and resolve operational issues.

  • Automate deployment, testing, and data validation steps as part of CI/CD pipelines.

  • Document architecture, data flow, and operational procedures to support ongoing system maintenance and compliance.

  • Participate in team meetings, code reviews, and knowledge sharing sessions.

The role involves technical leadership, problem-solving, and proactive communication to ensure data platform excellence.
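As a concrete (and hypothetical) example of the automated data-validation step mentioned in the day-to-day activities above, a CI/CD stage might run a simple row-count reconciliation between a source extract and the loaded target. The function name and tolerance below are invented for illustration:

```python
# Hypothetical CI/CD check: reconcile row counts between a source extract
# and the loaded target. Names and the tolerance value are illustrative
# assumptions, not part of any specific framework.

def reconcile(source_count, target_count, tolerance=0.0):
    """Return True if the target row count is within `tolerance` of the source."""
    if source_count == 0:
        return target_count == 0
    drift = abs(source_count - target_count) / source_count
    return drift <= tolerance

assert reconcile(1000, 1000)                  # exact match passes
assert not reconcile(1000, 990)               # any loss fails at zero tolerance
assert reconcile(1000, 995, tolerance=0.01)   # 0.5% drift is within a 1% tolerance
```

A check like this would typically run as a pipeline test in Jenkins or GitHub Actions, failing the build when loads drift beyond the agreed threshold.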

Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or related fields.

  • 5+ years supporting enterprise-level big data environments, with a focus on PySpark and Hadoop ecosystems.

  • Certifications such as Cloudera, Hortonworks, or AWS Data Analytics are advantageous.

  • Proven ability to lead technical projects, troubleshoot complex issues, and optimize performance.

  • Strong verbal and written communication skills, with the ability to collaborate with diverse teams.

Professional Competencies

  • Strong analytical and problem-solving mindset focused on data quality, performance, and operational stability.

  • Leadership qualities to guide junior team members and influence best practices.

  • Effective communication skills for stakeholder engagement and reporting.

  • Adaptability to evolving big data technologies, cloud platforms, and business demands.

  • Ownership and initiative to implement continuous improvements.

  • Excellent time management and prioritization skills to handle multiple projects efficiently.

SYNECHRON'S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and is an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.

All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.


The Company
Maharashtra
12,827 Employees
Year Founded: 2001

What We Do

At Synechron, we believe in the power of digital to transform businesses for the better. Our global consulting firm combines creativity and innovative technology to deliver industry-leading digital solutions. Synechron’s progressive technologies and optimization strategies span end-to-end Artificial Intelligence, Consulting, Digital, Cloud & DevOps, Data, and Software Engineering, servicing an array of noteworthy financial services and technology firms. Through research and development initiatives in our FinLabs we develop solutions for modernization, from Artificial Intelligence and Blockchain to Data Science models, Digital Underwriting, mobile-first applications and more. Over the last 20+ years, our company has been honored with multiple employer awards, recognizing our commitment to our talented teams. With top clients to boast about, Synechron has a global workforce of 14,700+, and has 48 offices in 19 countries within key global markets. For more information on the company, please visit our website: www.synechron.com.

