About the Role
We are seeking a highly skilled and experienced Data Lake Cloud Engineer with a proven track record of designing, implementing, and maintaining large-scale cloud-based data lake platforms. This role requires a professional who can take ownership of our current data lake ecosystem, optimize its performance, and drive future enhancements with minimal oversight. The ideal candidate will have at least 5 years of hands-on experience in building enterprise-grade data lakes, strong cloud architecture expertise, and the ability to work with cutting-edge data ingestion, processing, and analytics tools.
Key Responsibilities
- Take ownership of the existing enterprise data lake platform, ensuring scalability, reliability, and performance.
- Lead the design, architecture, and implementation of cloud-native data lake solutions and integrations.
- Manage and optimize data ingestion pipelines on Oracle OCI, using tools and methods such as Apache NiFi, Kafka, batch processing, data capture, and/or CSV file loads.
- Design and implement pipelines for network data ingestion and file formats (e.g., Parquet, Avro, ORC), ensuring efficient storage, processing, and retrieval.
- Build, configure, and tune query engines such as Trino (Presto), Spark, and Hive for efficient analytics and reporting.
- Implement and maintain metadata management, data governance, and security frameworks.
- Monitor and troubleshoot system performance, ensuring SLAs are met for ingestion, processing, and query workloads.
- Automate platform deployment, monitoring, and maintenance with Infrastructure-as-Code (Terraform, CloudFormation, etc.).
- Collaborate with data engineers, analysts, and business teams to understand data requirements and deliver solutions that maximize data accessibility and usability.
- Keep the data platform up to date with the latest open-source and cloud-agnostic technologies, implementing upgrades and enhancements where needed.
Requirements
5+ years of proven, hands-on experience implementing and managing large-scale data lakes in the cloud (OCI). Strong expertise in:
- Data ingestion & orchestration: Apache NiFi, Apache Kafka, CSV, and others.
- Data processing frameworks: Apache Spark, PySpark, Trino (Presto), Hive, Flink.
- Storage & lakehouse architectures: Delta Lake, Apache Hudi, Iceberg, and cloud-native object storage (S3).
- Query & analytics tools: Trino/Presto, SparkSQL, Metabase, or Apache Superset.
- Experience with data lake file formats such as Apache Parquet, Avro, ORC, and CSV, including ingestion, parsing, and analytics within a data lake.
- Solid understanding of data governance, lineage, cataloging, and security frameworks (Apache Atlas).
- Experience with CI/CD and IaC (ArgoCD, Terraform, Ansible) for automated deployments.
- Hands-on experience with cloud security best practices, including IAM, encryption, and network security.
- Strong proficiency in Python or Java for data engineering and automation tasks.
- Proven ability to work independently, quickly understand existing environments, and deliver results without extensive training.
Preferred Skills
- Exposure to machine learning workflows integrated with data lakes.
- Experience with real-time streaming data pipelines.
- Familiarity with containerization and orchestration (Docker, Kubernetes).
- Knowledge of cost optimization strategies in cloud-based data platforms.
Skills Required
- 5+ years of experience managing large-scale data lakes
- Experience with Apache NiFi, Kafka, Spark, Trino, Hive
- Proficiency in Python or Java for data tasks
- Hands-on experience with CI/CD and IaC tools
- Understanding of data governance and cloud security best practices
What We Do
emaratech, owned by the Investment Corporation of Dubai, is the leading technology and management consulting company in the Arab World, providing high-end market strategies, online solutions, outsourced technology, and advanced business information technology solutions for the private and public sectors. Its business domain knowledge includes border access and control, government eServices, security services, payment gateways, ERP, and real estate technology solutions. Services include product engineering, enterprise and custom online applications, systems integration, infrastructure hosting, business process re-engineering, IT and business consulting, quality management, project management, and managed services.
emaratech's subsidiaries include:
- noqodi: A comprehensive, technologically advanced payment gateway system that provides complete automated collections, reconciliation, and settlement services. www.noqodi.com
- Emirates Real Estate Solutions: A world-class real estate product registered in 17 countries, providing models focused on application solutions and managed services. www.eres.ae
- Digital Economy Solutions: A joint venture between emaratech and the Department of Economic Development of Dubai, with a mission to develop strategic digital projects with a focus on developing and issuing licenses for all businesses in Dubai.
- ZAJEL: A courier company that not only delivers parcels but also embeds itself seamlessly into business processes that are incomplete without delivery and collection.
- Emirates Real Estate Solutions (ERES): Develops and manages IT products including tenancy, contract management, land ownership, and unlisted property management solutions and portals.
If you are talented in any of the above fields and looking for a place to put your stamp in an innovative environment, we would like to hear from you. Submit your resume and let us know how you can make a difference at emaratech.







