Senior Data Engineer (Databricks)

Posted 5 Days Ago
Bangalore, Bengaluru Urban, Karnataka
In-Office
Senior level
Artificial Intelligence • eCommerce • Machine Learning • Software
The Role
Design, build, and optimize scalable data pipelines using Databricks. Ensure data quality, implement CI/CD, and collaborate with teams to drive data engineering excellence.
Position Overview 
At Hypersonix, we are building the leading Generative AI Platform for Commerce. Our flagship GenAI product, Competitor + Pricing AI, scrapes the product catalogs of our Enterprise customers and their competitors, and uses RAG to identify the nearest competitive match for each of our customers' products, enabling intelligent pricing strategies that were previously impossible to achieve. 
We are seeing strong growth in our Enterprise product and are building an end-to-end product on Databricks for Shopify store owners, specializing in agentic workflows that automate critical business processes (pricing and promotion strategies, inventory management, and competitive intelligence). 
We are seeking an experienced Senior Data Engineer to design, build, and optimize scalable data pipelines and infrastructure. The ideal candidate will have deep expertise in Databricks and modern data engineering practices, with a strong focus on building robust, production-grade data solutions that drive business value while maintaining cost efficiency. 
Key Responsibilities: 
Data Platform Development 
Design and implement enterprise-scale data pipelines using Databricks on AWS, leveraging both cluster-based and serverless compute paradigms 
Architect and maintain medallion architecture (Bronze/Silver/Gold) data lakes and lakehouses 
Develop and optimize Delta Lake tables for ACID transactions and efficient data management 
Build and maintain real-time and batch data processing workflows 
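To make the medallion pattern above concrete, here is a minimal sketch of the Bronze → Silver → Gold flow, using plain Python structures as a stand-in for PySpark DataFrames written to Delta Lake tables; all schemas, column names, and values are hypothetical:

```python
# Illustrative Bronze/Silver/Gold medallion sketch (hypothetical schema).
# In a real Databricks pipeline these would be PySpark transformations
# writing Delta Lake tables, not in-memory lists.

def to_silver(bronze_rows):
    """Clean raw (Bronze) records: drop rows missing a price, normalize types."""
    silver = []
    for row in bronze_rows:
        if row.get("price") is None:
            continue  # skip/quarantine malformed records
        silver.append({"sku": row["sku"].strip().upper(),
                       "price": float(row["price"])})
    return silver

def to_gold(silver_rows):
    """Aggregate cleaned (Silver) records into a Gold-level summary."""
    total = sum(r["price"] for r in silver_rows)
    return {"sku_count": len(silver_rows), "total_price": round(total, 2)}

bronze = [{"sku": " ab-1 ", "price": "9.99"},
          {"sku": "cd-2", "price": None},   # dropped at the Silver layer
          {"sku": "ef-3", "price": "20.01"}]
silver = to_silver(bronze)
gold = to_gold(silver)
```

The key idea the sketch preserves is that each layer only reads from the one below it: Bronze keeps raw ingests, Silver holds validated and typed records, and Gold serves aggregated, business-ready tables.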
Engineering Excellence 
Create reusable, modular data transformation logic using DBT to ensure data quality and consistency across the organization 
Develop complex Python applications for data ingestion, transformation, and orchestration 
Write optimized SQL queries and implement performance tuning strategies for large-scale datasets 
Implement comprehensive data quality checks, testing frameworks, and monitoring solutions 
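As a rough sketch of the data quality checks mentioned above, here is a lightweight validator in the spirit of frameworks like Great Expectations; the column names and rules are assumptions for illustration only:

```python
# Hypothetical data quality check: required columns present, price non-negative.
# A production pipeline would use a framework (e.g. Great Expectations, Deequ)
# and run these expectations as part of the pipeline, not ad hoc.

def check_quality(rows, required=("sku", "price")):
    """Return a list of (row_index, reason) failures for simple expectations."""
    failures = []
    for i, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:
                failures.append((i, f"missing {col}"))
        price = row.get("price")
        if price is not None and price < 0:
            failures.append((i, "negative price"))
    return failures

rows = [{"sku": "A1", "price": 10.0},
        {"sku": None, "price": -2.0}]
failures = check_quality(rows)
```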
Cost Management & Optimization 
Monitor and analyze Databricks DBU (Databricks Unit) consumption and cloud infrastructure costs 
Implement cost optimization strategies including cluster right-sizing, autoscaling configurations, and spot instance usage 
Optimize job scheduling to leverage off-peak pricing and minimize idle cluster time 
Establish cost allocation tags and chargeback models for different teams and projects 
Conduct regular cost reviews and provide recommendations for efficiency improvements 
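The cost analysis described above can be sketched as a back-of-the-envelope DBU cost model. The $0.55/DBU rate below is a placeholder, not an actual Databricks price; real rates vary by compute type, pricing tier, and cloud provider:

```python
# Hypothetical DBU cost estimator. ASSUMED_RATE_PER_DBU is a placeholder
# rate, not a real Databricks list price.

ASSUMED_RATE_PER_DBU = 0.55  # USD per DBU, assumed for illustration

def job_cost(dbu_per_hour, runtime_hours, rate=ASSUMED_RATE_PER_DBU):
    """Estimated cost of one job run: DBUs consumed times the rate."""
    return round(dbu_per_hour * runtime_hours * rate, 2)

def savings_from_rightsizing(dbu_before, dbu_after, runtime_hours):
    """Cost saved by shrinking a cluster, assuming runtime is unchanged."""
    return job_cost(dbu_before, runtime_hours) - job_cost(dbu_after, runtime_hours)

# e.g. right-sizing a 2-hour nightly job from 8 DBU/h to 4 DBU/h
saved = savings_from_rightsizing(8, 4, 2)
```

In practice the inputs would come from Databricks system tables rather than hand-entered constants, and right-sizing can lengthen runtimes, so the "runtime unchanged" assumption needs checking per workload.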
DevOps & Infrastructure 
Design and implement CI/CD pipelines for automated testing, deployment, and rollback of data artifacts 
Configure and optimize Databricks clusters, job scheduling, and workspace management 
Implement version control best practices using Git and collaborative development workflows 
Collaboration & Leadership 
Partner with data analysts, data scientists, and business stakeholders to understand requirements and deliver solutions 
Mentor junior engineers and promote best practices in data engineering 
Document technical designs, data lineage, and operational procedures 
Participate in code reviews and contribute to team knowledge sharing 
Required Qualifications: 
Technical Skills 
5+ years of experience in data engineering roles 
Expert-level proficiency in Databricks (Unity Catalog, Delta Live Tables, Workflows, SQL Warehouses) 
Strong understanding of cluster configuration, optimization, and serverless SQL compute 
Advanced SQL skills including query optimization, indexing strategies, and performance tuning 
Production experience with DBT (models, tests, documentation, macros, packages) 
Proficient in Python for data engineering (PySpark, pandas, data validation libraries) 
Hands-on experience with Git workflows (branching strategies, pull requests, code reviews) 
Proven track record implementing CI/CD pipelines (Jenkins, GitLab CI) 
Working knowledge of Snowflake architecture and migration patterns 
Additional Technical Skills: 
Experience with Apache Spark and PySpark optimization techniques (caching, partitioning, broadcast joins) 
Understanding of data modeling concepts (dimensional modeling, data vault, normalization) 
Knowledge of orchestration tools (Airflow, Databricks Workflows) 
Familiarity with cloud platforms (AWS, Azure, or GCP) and their data services 
Experience with data governance, security frameworks, and SOC 2 compliance 
Ability to use Databricks system tables and monitoring tools for cost analysis 
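Among the Spark optimizations listed above, the broadcast join is worth illustrating. The sketch below shows the concept in plain Python: the small dimension table is copied into an in-memory lookup so the large fact table is joined without a shuffle (in PySpark this would use `broadcast()` from `pyspark.sql.functions`); all table contents here are made up:

```python
# Conceptual broadcast (map-side) join: ship the small side to every
# worker as a lookup table, avoiding a shuffle of the large fact table.
# Illustrative only; in Spark, use pyspark.sql.functions.broadcast().

def broadcast_join(facts, small_dim, key="sku"):
    """Join each fact row against a lookup built from the small dimension table."""
    lookup = {row[key]: row for row in small_dim}  # the "broadcast" copy
    joined = []
    for fact in facts:
        dim = lookup.get(fact[key])
        if dim is not None:  # inner-join semantics
            joined.append({**fact, **dim})
    return joined

facts = [{"sku": "A1", "qty": 3}, {"sku": "B2", "qty": 1}, {"sku": "Z9", "qty": 5}]
dim = [{"sku": "A1", "name": "widget"}, {"sku": "B2", "name": "gadget"}]
result = broadcast_join(facts, dim)
```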
Professional Skills: 
Strong problem-solving abilities and analytical thinking 
Excellent communication skills with technical and non-technical audiences 
Ability to work independently and drive projects to completion 
Experience with Agile/Scrum methodologies 
Cost-conscious mindset with ability to balance performance and budget constraints 
Preferred Qualifications: 
Databricks certifications (Data Engineer Associate/Professional, Platform Administrator) 
Knowledge of data quality frameworks (Great Expectations, Deequ) 
Experience with container technologies (Docker, Kubernetes) 
Familiarity with Lakehouse architecture patterns and best practices 
Experience migrating from traditional data warehouses to modern data platforms 
Nice to Have: 
Familiarity with e-commerce and retail domain 
Experience with reverse ETL and data activation tools 
Knowledge of data catalog and metadata management tools 
Track record of achieving significant cost reductions (20%+) in cloud data platforms 
Experience with data mesh or data fabric architectural patterns 
Familiarity with Power BI, Tableau, or other BI tools 

Top Skills

Airflow
Spark
AWS
Databricks
dbt
Git
GitLab CI
Jenkins
PySpark
Python
Snowflake
SQL

The Company
HQ: San Jose, CA
81 Employees
Year Founded: 2019

What We Do

Hypersonix is the Enterprise AI Platform purpose-built for Ecommerce. It acts as an enterprise nervous system that senses and reacts to market conditions and customer demand for optimal results and profitable revenue, looking at all available data to provide real-time, actionable intelligence. Hypersonix users respond to instantaneous changes in customer demand, supply chain fluctuations, and inventory surpluses and shortfalls; they forecast and adjust pricing, optimize inventory, monitor supply chains, and drive profitable revenue growth.
