Data Processing Quality Engineer

Posted 12 Days Ago
Hyderābād, Telangāna
In-Office
5-5 Annually
Senior level
Software
DataPelago helps enterprises process data efficiently for AI and analytics with its Nucleus engine.
The Role
Ensure the quality of the data processing engine by focusing on accuracy, performance, and execution at scale. Requires testing expertise across various platforms.
Summary Generated by Built In

Ensure the quality of the data processing engine in terms of result accuracy, performance fidelity, and robust execution at scale.

Requirements  

BS in EE/CS or equivalent.

5+ years of experience in data processing quality or performance testing for database, data warehouse, or query engine applications. Experience testing platforms such as Apache Spark, Gluten, Velox, or DataFusion is preferred.

Solid knowledge of SQL, Python, and similar data processing languages.

Automation-first mindset, with experience in programming/scripting languages and automation tools.

Strong problem-solving skills and the ability to devise test strategies for complex systems.
Strong debugging skills, including root-cause analysis and narrowing down failures.

Experience in functional, performance, integration, and system-level testing.
Experience with public cloud platforms such as AWS, GCP, and MS Azure.

Good knowledge of tools such as Jira, Confluence, Git, and Jenkins.
Good understanding of SDLC and agile methodologies. 
Good understanding of CI/CD implementations.

Top Skills

Spark
AWS
Confluence
DataFusion
GCP
Git
Gluten
Jenkins
Jira
MS Azure
Python
SQL
Velox

The Company
HQ: Mountain View, CA
60 Employees
Year Founded: 2025

What We Do

DataPelago is redefining how enterprises process data for AI and analytics at scale. As organizations race to operationalize artificial intelligence, they are discovering that the greatest barrier to progress isn’t a lack of models or talent – it’s the infrastructure beneath them. Data pipelines remain fragmented across specialized systems for analytics, AI, and data engineering, each optimized for specific workloads but incapable of operating as a cohesive whole. The result is inefficiency: duplicated data, stranded compute resources, and escalating costs that slow innovation.

DataPelago was founded to solve this challenge. Its flagship product, Nucleus, is the world’s first Universal Data Processing Engine (UDPE) – a new layer that sits between data lakes and query engines to unify data processing within a single, hardware-aware stack. Built from first principles for accelerated computing, Nucleus allows companies to process, move, and activate their data orders of magnitude more efficiently than existing systems.

At its core, Nucleus dynamically orchestrates workloads across heterogeneous compute environments – CPUs, GPUs, TPUs, and FPGAs – ensuring every job runs on the optimal hardware for maximum performance and efficiency. This unified approach eliminates the need to maintain separate infrastructure for different data workloads, reducing complexity and cutting total cost of ownership by up to 40%.

Nucleus supports structured, unstructured, and semi-structured data in a single environment, enabling AI and analytics workloads to coexist seamlessly. It integrates easily with existing data ecosystems and open-source frameworks, providing enterprises with flexibility and performance without requiring code changes or proprietary lock-in.

With Nucleus, data teams can accelerate queries, streamline pipelines, and scale AI initiatives faster, all while controlling infrastructure spend. Early adopters across industries are leveraging the platform to speed up data preparation, model training, and real-time analytics by up to 10x, turning data from a bottleneck into a competitive advantage.

DataPelago’s mission is to make high-performance, cost-efficient data processing achievable for every enterprise. By bridging the gap between data infrastructure and AI innovation, the company is helping organizations unlock the full potential of their data, laying the foundation for a new era of intelligence at scale.

Why Work With Us

DataPelago is pioneering the world’s first Universal Data Processing Engine, unifying AI and analytics in a single, hardware-aware platform. We’re solving one of the biggest challenges in enterprise AI – making data infrastructure faster, simpler, and more efficient. Join us to build the foundation for the next era of intelligent computing.
