Data Architect

Pune, Mahārāshtra, IND
In-Office
Senior level
Artificial Intelligence • Information Technology • Professional Services • Software
The Role
Design and maintain Databricks-based lakehouse architecture; build and optimize production ETL/ELT pipelines with PySpark/Spark SQL and DLT; implement governance (Unity Catalog, data quality), CI/CD, and infrastructure-as-code; collaborate with stakeholders and mentor engineers to deliver scalable, reliable data products.

Data Architect — Databricks

Data Engineering & Pipelines  |  Mid-Level  |  Full-Time

Experience: 5 – 8 Years
Level: Mid-Level
Employment Type: Full-Time
Location: Pune - Hybrid
Primary Stack: Databricks, Apache Spark, Delta Lake, SQL
Domain: Data Engineering & Pipelines

About the Role

We are looking for a hands-on Data Architect with deep expertise in Databricks to design, build, and optimise enterprise-scale data platforms. You will own the end-to-end data engineering lifecycle — from ingestion and transformation to serving — while ensuring reliability, scalability, and governance across our lakehouse architecture.

You will collaborate closely with data engineers, analytics engineers, and product teams to translate business requirements into robust, reusable data solutions on the Databricks Lakehouse Platform.

Key Responsibilities

Data Architecture & Design

      Design and maintain the organisation's lakehouse architecture using Databricks and Delta Lake.

      Define data modelling standards (dimensional, Data Vault 2.0, or medallion architecture) across Bronze, Silver, and Gold layers.

      Architect scalable ingestion frameworks for structured and unstructured data sources (Kafka, JDBC, REST APIs, cloud storage).

      Own the schema evolution strategy and ensure backward compatibility across data assets.

Pipeline Development & Optimisation

      Build and maintain production-grade ETL/ELT pipelines using PySpark, Spark SQL, and Databricks Workflows.

      Implement Delta Live Tables (DLT) for declarative pipeline development with autoscaling compute.

      Optimise Spark jobs for performance — partitioning, Z-ordering, caching, and cluster right-sizing.

      Establish CI/CD practices for data pipelines using tools such as GitHub Actions, Azure DevOps, or Databricks Asset Bundles.

Data Governance & Quality

      Implement Unity Catalog for data discovery, lineage tracking, fine-grained access control, and compliance.

      Define and enforce data quality rules using Great Expectations, DLT expectations, or equivalent frameworks.

      Work with data governance teams to document metadata, the business glossary, and data contracts.

Platform & Infrastructure

      Manage Databricks workspace configuration: clusters, pools, secrets, and access policies.

      Collaborate with cloud and DevOps teams on infrastructure-as-code (Terraform) for Databricks on Azure / AWS / GCP.

      Monitor platform health, SLAs, and cost using Databricks system tables and cloud-native monitoring tools.

Collaboration & Mentorship

      Partner with data consumers (analysts, data scientists, ML engineers) to define SLAs and publish clean, well-documented data products.

      Review code and provide architectural guidance to junior engineers.

      Contribute to and champion internal data engineering best practices, runbooks, and documentation.

Required Skills & Experience

Core Databricks & Spark

      4+ years of hands-on experience with Databricks (Unified Data Analytics Platform).

      Strong proficiency in PySpark and Spark SQL for large-scale data transformation.

      Deep knowledge of Delta Lake — ACID transactions, time travel, OPTIMIZE, VACUUM.

      Experience with Databricks Workflows, Jobs, and Delta Live Tables (DLT).

      Familiarity with Unity Catalog and Databricks governance features.

Data Engineering Fundamentals

      Solid understanding of data modelling paradigms: dimensional modelling, Data Vault, or medallion architecture.

      Experience designing and operating streaming pipelines (Structured Streaming, Kafka, Event Hubs, or Kinesis).

      Proficiency in SQL; experience with dbt is a strong plus.

      Hands-on experience with cloud platforms: Azure (ADLS, ADF), AWS (S3, Glue), or GCP (BigQuery, GCS).

Software Engineering Practices

      Version control with Git; experience with branching strategies and code review workflows.

      Ability to write testable, modular pipeline code with unit and integration tests.

      Familiarity with CI/CD pipelines and infrastructure-as-code (Terraform preferred).

Nice to Have

      Databricks Certified Data Engineer Associate or Professional certification.

      Experience with data mesh or data product frameworks.

      Exposure to ML pipelines, MLflow, or Feature Store on Databricks.

      Knowledge of data cataloguing tools (Alation, Collibra, or Databricks Unity Catalog).

      Experience with Apache Iceberg or Apache Hudi as alternative table formats.

      Familiarity with real-time analytics or OLAP systems (Druid, ClickHouse, Redshift).

What We Offer

      Competitive salary with performance-linked bonus.

      Flexible / hybrid working arrangements.

      Access to Databricks training and certification budget.

      Collaborative, engineering-first data culture with modern tooling.

      Clear career progression path to Senior Data Architect or Data Platform Lead.

      Comprehensive health, wellness, and retirement benefits.

The Company
0 Employees
Year Founded: 2012

What We Do

nCircle Tech is a development partner specializing in 2D/3D product development, custom software development, and BIM services for the AEC and Manufacturing sectors, focusing on 3D visualization, CAD/BIM customization, and AI-driven automation.
