We are seeking a highly skilled Data Pipeline Engineer to design, build, and optimize scalable data ingestion and transformation pipelines on AWS. The ideal candidate will be responsible for developing end-to-end data pipelines, implementing data quality frameworks, and enabling reliable data delivery into a governed data lake environment.
This role requires strong expertise in AWS data services, ETL development, Infrastructure as Code (IaC), and modern data engineering best practices.
- Develop and implement end-to-end data ingestion pipelines that move data from source systems through Amazon S3 and AWS Glue ETL into AWS Lake Formation curated zones.
- Design and implement ETL transformations based on defined data mapping specifications.
- Build and maintain data quality validation frameworks to ensure completeness, schema conformance, and referential integrity.
- Configure error handling, exception management, and dead-letter queue patterns for failed ingestion records.
- Register and manage datasets within AWS Glue Data Catalog, ensuring standardized metadata tagging and governance compliance.
- Develop and maintain Infrastructure as Code (IaC) using Terraform or AWS CDK for data platform resources.
- Execute unit testing and integration testing to validate pipeline functionality and data accuracy.
- Support User Acceptance Testing (UAT) by validating output datasets, schemas, and record counts with business stakeholders.
- Create and maintain technical documentation including pipeline configurations, runbooks, and data flow diagrams.
- Participate in Agile ceremonies including sprint planning, stand-ups, demos, and code reviews.
Requirements
Required Skills & Experience
- 5+ years of experience in Data Engineering or Data Platform Development.
- Strong hands-on experience with AWS Glue, AWS Lake Formation, Amazon S3, and Amazon Athena.
- Proficiency in Python, PySpark, and SQL.
- Experience building ETL/ELT pipelines in cloud-native environments.
- Hands-on experience with Infrastructure as Code (IaC) tools such as Terraform or AWS CDK.
- Strong understanding of data modeling, data governance, and metadata management.
- Experience with automated testing and deployment practices.
- Experience with Great Expectations or similar data quality frameworks.
- Exposure to SHIP-HATS CI/CD pipelines.
- Knowledge of enterprise data cataloging and governance solutions.
- Prior experience working with Government Agencies or Public Sector projects.
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related discipline.
- Existing unrestricted work authorization in Singapore (visa sponsorship not available).
What We Do
OneByZero is a consulting company specializing in artificial intelligence and data solutions, helping businesses transform and thrive in the digital age with AI-powered co-workers and data-driven consulting.