Senior Data Engineer - Python & PySpark

Sorry, this job was removed at 12:06 p.m. (CST) on Monday, Dec 01, 2025
Primera, Tlachichuca, Puebla, MEX
In-Office
Fintech • Financial Services
The Role
The Senior Data Engineer will be responsible for the architecture, design, development, and maintenance of our data platforms, with a strong focus on leveraging Python and PySpark for data processing and transformation. This role requires a strong technical leader who can work independently and as part of a team, contributing to the overall data strategy and helping to drive data-driven decision-making across the organization.
Key Responsibilities
  • Data Architecture & Design: Design, develop, and optimize data architectures, pipelines, and data models to support various business needs, including analytics, reporting, and machine learning.
  • ETL/ELT Development (Python/PySpark Focus): Build, test, and deploy highly scalable and efficient ETL/ELT processes using Python and PySpark to ingest, transform, and load data from diverse sources into data warehouses and data lakes. Develop and optimize complex data transformations using PySpark.
  • Data Quality & Governance: Implement best practices for data quality, data governance, and data security to ensure the integrity, reliability, and privacy of our data assets.
  • Performance Optimization: Monitor, troubleshoot, and optimize data pipeline performance, ensuring data availability and timely delivery, particularly for PySpark jobs.
  • Infrastructure Management: Collaborate with DevOps and MLOps teams to manage and optimize data infrastructure, including cloud resources (AWS, Azure, GCP), databases, and data processing frameworks, ensuring efficient operation of PySpark clusters.
  • Mentorship & Leadership: Provide technical guidance, mentorship, and code reviews to junior data engineers, particularly in Python and PySpark best practices, fostering a culture of excellence and continuous improvement.
  • Collaboration: Work closely with data scientists, analysts, product managers, and other stakeholders to understand data requirements and deliver solutions that meet business objectives.
  • Innovation: Research and evaluate new data technologies, tools, and methodologies to enhance our data capabilities and stay ahead of industry trends.
  • Documentation: Create and maintain comprehensive documentation for data pipelines, data models, and data infrastructure.
Qualifications
Education
  • Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, or a related quantitative field.
Experience
  • 5+ years of professional experience in data engineering, with a strong emphasis on building and maintaining large-scale data systems.
  • Extensive hands-on experience with Python for data engineering tasks.
  • Proven experience with PySpark for big data processing and transformation.
  • Proven experience with cloud data platforms (e.g., AWS Redshift, S3, EMR, Glue; Azure Data Lake, Databricks, Synapse; Google BigQuery, Dataflow).
  • Strong experience with SQL and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB, Cassandra).
  • Extensive experience with distributed data processing frameworks, especially Apache Spark.
Technical Skills
  • Programming Languages: Expert proficiency in Python is mandatory. Strong SQL mastery is essential. Familiarity with Scala or Java is a plus.
  • Big Data Technologies: In-depth knowledge and hands-on experience with Apache Spark (PySpark) for data processing, including Spark SQL, Spark Streaming, and the DataFrame API. Experience with Apache Kafka, Apache Airflow, Delta Lake, or similar technologies.
  • Data Warehousing: In-depth knowledge of data warehousing concepts, dimensional modeling, and ETL/ELT processes.
  • Cloud Platforms: Hands-on experience with at least one major cloud provider (AWS, Azure, GCP) and their data services, particularly those supporting Spark/PySpark workloads.
  • Containerization: Familiarity with Docker and Kubernetes is a plus.
  • Version Control: Proficient with Git and CI/CD pipelines.
Soft Skills
  • Excellent problem-solving and analytical abilities.
  • Strong communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders.
  • Ability to work effectively in a fast-paced, agile environment.
  • Proactive and self-motivated with a strong sense of ownership.
Preferred Qualifications
  • Experience with real-time data streaming and processing using PySpark Structured Streaming.
  • Knowledge of machine learning concepts and MLOps practices, especially integrating ML workflows with PySpark.
  • Familiarity with data visualization tools (e.g., Tableau, Power BI).
  • Contributions to open-source data projects.

Job Family Group: Technology
Job Family: Data Analytics
Time Type: Full time
Most Relevant Skills: Please see the requirements listed above.
Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.

Citi Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Citi and has not been reviewed or approved by Citi.

  • Healthcare Strength: Benefits coverage is positioned as comprehensive, including health, dental, and vision insurance plus on-site clinics, prescription drug support, and disability coverage. Family-building support such as fertility assistance is described as a notable differentiator within the overall package.
  • Retirement Support: Retirement benefits are framed as strong, highlighted by a 401(k) with matching and additional plan options like a Roth 401(k). Financial support is reinforced through discounts and broader financial guidance resources tied to the benefits ecosystem.
  • Wellbeing & Lifestyle Benefits: Wellbeing support extends beyond insurance through programs like an Employee Assistance Program, counseling/legal resources, and gym or wellness reimbursement. These offerings increase the perceived total rewards value even when cash compensation sentiment varies by role.

The Company
HQ: Kwun Tong, Kowloon
223,850 Employees

What We Do

Citi's mission is to serve as a trusted partner to our clients by responsibly providing financial services that enable growth and economic progress. Our core activities are safeguarding assets, lending money, making payments and accessing the capital markets on behalf of our clients. We have 200 years of experience helping our clients meet the world's toughest challenges and embrace its greatest opportunities. We are Citi, the global bank – an institution connecting millions of people across hundreds of countries and cities.
