Important Information
- Experience: More than 5 years
- Job Mode: Full-time
- Work Mode: Hybrid
Job Summary
We are looking for a Data Engineer with deep expertise in building, optimizing, and managing scalable data pipelines using Azure Databricks, Spark (Spark SQL, PySpark), and related Azure services. The ideal candidate will play a critical role in designing robust data workflows that support the ingestion, transformation, and publication of data for analytics and business integrations, with a strong emphasis on performance, scalability, and governance.
Responsibilities and Duties
Data Pipeline Development
Design and implement end-to-end data pipelines using Azure Databricks.
Write efficient Spark SQL and PySpark code to transform data with integrity and accuracy.
Automate workflows using Databricks Workflows, Azure Data Factory, or Apache Airflow.
Data Ingestion & Transformation
Build scalable ingestion processes for structured, semi-structured, and unstructured data.
Develop robust transformation logic and leverage Delta Lake for versioning and incremental loads.
Data Publishing & Integration
Publish clean and transformed data to Azure Data Lake, making it available for analytics.
Define and document best practices for data pipeline development and maintenance.
Data Governance & Security
Apply governance policies using Unity Catalog, ensuring access control and metadata management.
Implement encryption and RBAC standards across data platforms for secure operations.
Performance Tuning & Optimization
Optimize Spark workloads for speed and cost-efficiency through partitioning, caching, and tuning.
Monitor data pipelines continuously, resolving bottlenecks and scaling as needed.
Automation & Monitoring
Automate infrastructure and deployments using Terraform or similar tools.
Set up proactive monitoring using Azure Monitor and Databricks native capabilities.
Qualifications and Skills
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
5+ years of experience in data engineering with a focus on Azure and Databricks ecosystems.
Strong communication and documentation skills.
Ability to work independently and collaboratively within cross-functional teams.
Role-Specific Requirements
Proven experience designing ETL/ELT pipelines in Azure Databricks.
Expertise in Spark SQL and PySpark development.
Hands-on experience with Databricks Workflows, Delta Lake, and Azure Data Lake.
Deep understanding of data governance, especially using Unity Catalog.
Proficient in tuning the performance of Spark jobs and optimizing cloud costs.
Technologies
Azure Databricks, Apache Spark, PySpark, SQL
Azure Data Lake, Azure Blob Storage, Azure Synapse
Delta Lake, Unity Catalog
Azure Data Factory, Apache Airflow
Terraform, Azure Monitor
Skillset Competencies
Data pipeline architecture and design
Cloud-based data engineering (Azure)
Performance and cost optimization
Automation and DevOps in data environments
Data governance and security compliance
Effective troubleshooting and monitoring of data workflows
About Encora
Encora is a trusted partner for digital engineering and modernization, working with some of the world’s leading enterprises and digital-native companies. With over 9,000 experts in 47+ offices worldwide, Encora offers expertise in areas such as Product Engineering, Cloud Services, Data & Analytics, AI & LLM Engineering, and more. At Encora, hiring is based on skills and qualifications, embracing diversity and inclusion regardless of age, gender, nationality, or background.
What We Do
Headquartered in Santa Clara, California, and backed by renowned private equity firms Advent International and Warburg Pincus, Encora is the preferred technology modernization and innovation partner to some of the world’s leading enterprise companies. It provides award-winning digital engineering services including Product Engineering & Development, Cloud Services, Quality Engineering, DevSecOps, Data & Analytics, Digital Experience, Cybersecurity, and AI & LLM Engineering. Encora's deep cluster vertical capabilities extend across diverse industries, including HiTech, Healthcare & Life Sciences, Retail & CPG, Energy & Utilities, Banking, Financial Services & Insurance, Travel, Hospitality & Logistics, Telecom & Media, Automotive, and other specialized industries.
With over 9,000 associates in 47+ offices and delivery centers across the U.S., Canada, Latin America, Europe, India, and Southeast Asia, Encora delivers nearshore agility to clients anywhere in the world, coupled with expertise at scale in India. Encora’s Cloud-first, Data-first, AI-first approach enables clients to create differentiated enterprise value through technology.