Ensure the quality of the data processing engine in terms of result accuracy, performance fidelity, and robust execution at scale.
Requirements
• BS in EE/CS or equivalent.
• 5+ years of experience in data processing quality or performance testing for database, data warehouse, or query engine applications. Experience testing platforms such as Apache Spark, Gluten, Velox, or DataFusion is preferred.
• Solid knowledge of SQL, Python, and similar data processing languages.
• Automation-first mindset, with experience in programming/scripting languages and automation tools.
• Strong problem-solving skills and the ability to devise test strategies for complex systems.
• Strong skills in debugging, root-cause analysis, and narrowing down failures.
• Experience in functional, performance, integration, and system-level testing.
• Experience with public cloud platforms such as AWS, GCP, and Microsoft Azure.
• Good knowledge of tools such as Jira, Confluence, Git, and Jenkins.
• Good understanding of SDLC and agile methodologies.
• Good understanding of CI/CD implementations.
What We Do
We are on a mission to make it viable to extract value from all data in the world — so humanity can capture every insight, cure, invention, and opportunity.
Traditional processing solutions based on CPUs and today’s software architectures cannot handle the complexity and volume of data, which is doubling every two years, with unstructured data now accounting for 90% of all data created. The surge of GenAI and its dependence on huge volumes of unstructured data are compounding the processing challenge. DataPelago is creating a new data processing standard for the accelerated computing era to overcome these performance, cost, and scalability limitations.
DataPelago's revolutionary Universal Data Processing Engine accelerates any engine, including open source, on any hardware, using any data type. DataPelago enables organizations to extract value from data at unprecedented price and performance for their GenAI and analytics workloads.