We are seeking a highly skilled and motivated Big Data Engineer to join our dynamic team. The ideal candidate will have extensive experience in designing, developing, and optimizing scalable data solutions using the Hadoop ecosystem, with a strong focus on PySpark and Hive. This role is crucial for building robust ETL pipelines, ensuring data quality, and driving performance improvements across our Big Data initiatives.
Key Responsibilities
Design, develop, and maintain efficient and scalable Big Data solutions using PySpark, Apache Hive, and Hadoop ecosystem tools (e.g., Sqoop).
Apply strong Python programming skills to data engineering tasks.
Implement and optimize ETL (Extract, Transform, Load) processes and data warehousing solutions, including Fact, Dimension, and Slowly Changing Dimensions (SCD-2).
Conduct in-depth data analysis, troubleshoot complex data issues, and ensure the accuracy, reliability, and integrity of data.
Optimize Big Data workflows, including Spark job tuning and Hive query optimization, leveraging partitioning strategies and indexing techniques in distributed storage systems.
Perform rigorous unit testing and validation of data pipelines and transformations.
Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver robust data solutions.
Big Data Technologies: Demonstrated proficiency with Apache Hadoop, Apache Hive, and PySpark for data processing and analysis.
Data Warehousing & Modeling: Strong understanding and practical experience with data warehousing concepts, dimensional modeling, and SCD-2 implementation.
ETL Development: Proven experience in designing and developing ETL pipelines; familiarity with various ETL tools is an advantage.
Database & SQL: Advanced SQL knowledge, including complex joins, subqueries, and performance tuning of SQL queries.
Scripting: Proficient in shell scripting for automation of batch processes.
DevOps & CI/CD: Experience with CI/CD tools such as Bitbucket and Jenkins.
BI Tools: Familiarity with business intelligence (BI) reporting tools like Tableau.
Experience and/or certifications with major cloud platforms and their Big Data services (e.g., AWS, Azure Databricks, Google Cloud).
Advanced knowledge of Unix shell scripting for system administration and automation.
Excellent critical thinking and problem-solving skills with a strong analytical mindset.
Ability to work independently and collaboratively in a fast-paced environment.
Strong communication skills to articulate technical concepts and solutions effectively.
Education:
Bachelor’s degree/University degree or equivalent experience
If you are a passionate Big Data Engineer looking to make a significant impact, we encourage you to apply!
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
------------------------------------------------------
Job Family Group: Technology
Job Family: Applications Development
Time Type: Full time
Most Relevant Skills: Please see the requirements listed above.
Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.
Skills Required
- Extensive experience with Hadoop ecosystem
- Strong focus on PySpark and Hive
- Advanced SQL knowledge
- Experience with ETL development
- Proficient in shell scripting
- Experience with CI/CD tools such as Bitbucket and Jenkins
- Bachelor's degree or equivalent experience
Citi Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Citi and has not been reviewed or approved by Citi.
- Healthcare Strength: Benefits coverage is positioned as comprehensive, including health, dental, and vision insurance plus on-site clinics, prescription drug support, and disability coverage. Family-building support such as fertility assistance is described as a notable differentiator within the overall package.
- Retirement Support: Retirement benefits are framed as strong, highlighted by a 401(k) with matching and additional plan options like a Roth 401(k). Financial support is reinforced through discounts and broader financial guidance resources tied to the benefits ecosystem.
- Wellbeing & Lifestyle Benefits: Wellbeing support extends beyond insurance through programs like an Employee Assistance Program, counseling/legal resources, and gym or wellness reimbursement. These offerings increase the perceived total rewards value even when cash compensation sentiment varies by role.
Citi Insights
What We Do
Citi's mission is to serve as a trusted partner to our clients by responsibly providing financial services that enable growth and economic progress. Our core activities are safeguarding assets, lending money, making payments and accessing the capital markets on behalf of our clients. We have 200 years of experience helping our clients meet the world's toughest challenges and embrace its greatest opportunities. We are Citi, the global bank – an institution connecting millions of people across hundreds of countries and cities.








