Experience: 5 to 8 years. Candidates whose CVs show the relevant skills will be contacted.
Role Purpose
The Big Data Engineer is responsible for designing, developing, and maintaining scalable data pipelines and distributed data processing systems. This role enables efficient data ingestion, transformation, and analytics across batch and real-time environments, supporting enterprise-wide data initiatives.
Key Responsibilities
- Design, develop, and maintain scalable batch and real-time data pipelines to support analytics and business intelligence use cases
- Build and manage distributed data processing solutions using Apache Hadoop and Apache Spark within Cloudera Data Platform (CDP)
- Develop and orchestrate ETL workflows using tools such as Apache NiFi
- Implement and manage real-time streaming pipelines using Apache Kafka
- Work with distributed storage systems such as Hadoop Distributed File System (HDFS)
- Utilize query engines like Apache Hive and Apache Impala for data access and analytics
- Perform data ingestion, transformation, and integration from multiple structured and unstructured enterprise data sources
- Optimize data pipelines for performance, scalability, and reliability
- Monitor and troubleshoot data workflows, ensuring high availability and data integrity
- Collaborate closely with data architects, analysts, and business stakeholders to deliver data solutions aligned with business needs
- Ensure adherence to data governance, security, and data quality standards
- Document data processes, architectures, and workflows for operational efficiency
- 5–8 years of experience in Big Data Engineering or related roles
- Strong hands-on experience with:
  - Apache Hadoop ecosystem
  - Apache Spark (PySpark/Scala preferred)
  - Apache Kafka (streaming)
  - Apache NiFi (data ingestion/ETL)
- Experience with Cloudera Data Platform (CDP) or similar big data platforms
- Proficiency in SQL and at least one programming language (Python, Scala, or Java)
- Solid understanding of distributed computing and parallel processing concepts
- Experience working with HDFS, Hive, and Impala
- Knowledge of data modeling, ETL design, and data warehousing concepts
- Familiarity with data governance, security, and compliance frameworks
- Strong problem-solving and performance tuning skills
- Experience with cloud platforms (AWS, Azure, or GCP) in big data environments
- Knowledge of containerization (Docker/Kubernetes) is a plus
- Exposure to CI/CD pipelines for data engineering workflows
- Understanding of real-time analytics and event-driven architectures
- Analytical thinking and problem-solving
- Strong collaboration and communication skills
- Ability to work in fast-paced, data-driven environments
- Attention to detail and commitment to data quality
- Bachelor’s or Master’s degree in Computer Science, Information Technology, Data Engineering, or a related field
Skills Required
- 5–8 years of experience in Big Data Engineering or related roles
- Experience with Apache Hadoop ecosystem
- Experience with Apache Spark (PySpark/Scala preferred)
- Experience with Apache Kafka (streaming)
- Experience with Apache NiFi (data ingestion/ETL)
- Experience with Cloudera Data Platform (CDP) or similar big data platforms
- Proficiency in SQL and at least one programming language (Python, Scala, or Java)
Datamatics Technologies Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Datamatics Technologies and has not been reviewed or approved by Datamatics Technologies.
- Flexible Benefits: Feedback suggests flexible timings and work-from-home options are available in some roles. This flexibility is highlighted as part of the employment experience across certain postings and materials.
- Wellbeing & Lifestyle Benefits: Feedback suggests flexibility around time off and remote work supports work-life balance. These elements can help offset leaner cash components for some individuals.
Datamatics Technologies Insights
What We Do
Datamatics Technologies (DMT) was established in Dubai. We specialize in onsite and offshore professional services covering the full spectrum of the Data Analytics and Data Science domains. Our experience with diverse industry sectors such as Telecoms, Finance, Government and Manufacturing, across multiple regions, enables us to engage and deliver for our clients with confidence. We offer our full portfolio of services through resource augmentation or managed services, under either time-and-materials (T&M) or fixed-price arrangements. Through our end-to-end managed services offering, we enable our clients to cut costs, increase profitability and focus on adding value to their core business activities. Our project and delivery management team is certified in Agile, PMI and ITIL, so that planning and execution follow industry best practices. We work with clients across the Middle East and Africa region.