Software Engineering - Data, Lakehouse and AI Data Platform Engineer-Bangalore-Associate

Bengaluru, Bengaluru Urban, Karnataka, IND
In-Office
Mid level
Fintech • Financial Services
The Role
Design, build, test and support batch and streaming data pipelines and curated datasets on a lakehouse/AI data platform. Implement data modelling, partitioning and schema evolution, ensure data quality and reconciliation, and build reusable tooling. Collaborate with stakeholders to deliver production-ready, well-tested data products and improve platform capabilities.
Summary Generated by Built In

At Goldman Sachs, engineering teams are positioned at the centre of the business, building scalable systems, solving complex technical problems and turning data into action. In data engineering roles, the emphasis is on designing, building and maintaining large-scale data platforms, delivering production pipelines, improving reliability and quality, and partnering closely with users of the platform.

This is a delivery-focused role for engineers who want to build robust data assets in production, work with modern data technologies, and grow over time within the firm. You will contribute to the data models, pipelines and platform capabilities that underpin analytics, operational decision-making and emerging AI use cases, and may also help extend platform tooling where additional functionality is needed.

Role Summary

As a Data Engineer in the Lakehouse and AI Data Platform team, you will design, build, test and support data pipelines and curated datasets on the firm’s modern data platform. You will work across ingestion, transformation, modelling, optimisation and data quality, helping to deliver data products that are reliable, scalable and fit for purpose. Where there are gaps in platform functionality, you may also contribute to shared tooling or framework components that improve how the platform is used and operated.

The role is suited to engineers who are comfortable writing code, working with SQL and distributed data processing, and solving practical delivery problems in a team environment. More experienced candidates may also contribute to technical design, platform standards and the shaping of delivery approaches across a wider set of use cases.

Key Responsibilities

Pipeline Engineering

  • Build, enhance and support batch and streaming data pipelines on the Lakehouse and AI data platform.
  • Refactor or modernise existing data flows where needed to improve reliability, performance and maintainability.
  • Where needed, build reusable tooling to improve delivery, consistency and operational support.
  • Ensure data pipelines are production-ready, well tested and operationally supportable.

Data Modelling and Curation

  • Develop raw, refined and curated datasets that support analytics, reporting and AI use cases.
  • Apply sound data modelling principles to represent business entities, relationships and historical change accurately.
  • Work with consumers to shape data products that are usable, well documented and aligned to business needs.

Data Quality and Reconciliation

  • Implement controls to validate completeness, accuracy and consistency of data across pipelines and datasets.
  • Use reconciliation approaches to build confidence in production outputs and investigate breaks where they arise.
  • Contribute to clear standards for testing, monitoring and issue resolution.
  • Contribute to practical improvements in testing, monitoring or reconciliation tooling where these strengthen platform reliability and day-to-day delivery.

Skills and Experience

Required

  • 3+ years of experience
  • Bachelor’s or master’s degree in a relevant discipline, or equivalent practical experience, with evidence of strong quantitative skills or data engineering expertise.
  • Strong hands-on programming experience in Python or Java.
  • Good working knowledge of SQL, including troubleshooting, optimization and data analysis.
  • Ability to learn new tools, internal platforms and delivery workflows quickly.
  • Familiarity with software engineering fundamentals, including version control, testing, release discipline and CI/CD practices.

Data Engineering Capability

  • Understanding of temporal data modelling, including the handling of historical state and change over time.
  • Knowledge of schema design, schema evolution and data compatibility considerations.
  • Understanding of partitioning, clustering and other techniques used to improve data performance at scale.
  • Ability to make sensible design choices across normalized and denormalized models, and between natural and surrogate keys.
  • Practical approach to data quality, reconciliation and root-cause analysis.
  • Experience building or supporting production data pipelines in a collaborative engineering environment.
  • Experience working with distributed data processing frameworks such as Apache Spark.
  • Working knowledge of common data formats such as JSON, Avro and Parquet.

For More Experienced Candidates

  • Stronger ownership of technical design across multiple datasets or pipeline domains.
  • Experience guiding implementation standards, code quality and engineering practices within a team.
  • Ability to lead delivery for a workstream, manage dependencies and support less experienced engineers.

Technology Environment

The role will involve working with a modern and evolving data stack. Candidates are not expected to have deep expertise in every tool from day one but should bring relevant experience and the ability to work across comparable technologies.

Examples of technologies in scope include:

  • Data processing and logic: ANSI SQL, Apache Spark, Kafka
  • Data formats: JSON, Avro, Parquet
  • Platforms and storage: Snowflake, Apache Iceberg, Databricks, Hadoop ecosystem technologies, Sybase IQ
  • Engineering and deployment: CI/CD tooling, containerized or Kubernetes-based deployment approaches where relevant

You will also work with internal data management and platform tooling, so a practical and adaptable engineering mindset is important.


What We Are Looking For

We are looking for engineers who can deliver well-structured, reliable solutions in production and who take ownership of the quality of what they build. The role suits candidates who are technically strong, pragmatic and comfortable working in a fast-paced environment where data platforms support important business outcomes.

Stronger candidates will typically demonstrate:

  • sound judgement in technical trade-offs
  • attention to detail in data correctness and testing
  • a clear and structured approach to problem solving
  • willingness to work closely with stakeholders and partner teams
  • an interest in developing long-term expertise within the firm

Skills Required

  • 3+ years of experience
  • Bachelor's or Master's degree in a relevant discipline or equivalent practical experience
  • Strong hands-on programming experience in Python or Java
  • Good working knowledge of SQL (ANSI SQL), including troubleshooting and optimization
  • Experience building or supporting production batch and streaming data pipelines
  • Experience with distributed data processing frameworks such as Apache Spark
  • Familiarity with Kafka for streaming data
  • Working knowledge of data formats JSON, Avro and Parquet
  • Familiarity with data platform technologies (examples: Snowflake, Apache Iceberg, Databricks, Hadoop, Sybase IQ)
  • Familiarity with software engineering fundamentals: version control, testing, release discipline and CI/CD practices
  • Ability to learn new tools, internal platforms and delivery workflows quickly
  • Understanding of temporal data modelling, schema design/evolution, partitioning and clustering techniques
  • Practical approach to data quality, reconciliation, monitoring and root-cause analysis
  • Ability to lead delivery for a workstream and support less experienced engineers

Goldman Sachs Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Goldman Sachs and has not been reviewed or approved by Goldman Sachs.

  • Healthcare Strength: Coverage includes medical, dental, vision, disability, life and accident insurance, with multiple plan options and most premiums subsidized; coverage often starts on day one. Wellness resources, on-site health centers in some locations, and EAP access reinforce the depth of health support.
  • Parental & Family Support: Family care includes on-site childcare in some offices, expectant parent resources, and transitional programs for returning parents. Feedback suggests parental leave is very generous, with reports of around 20 weeks paid leave and stipends for adoption, surrogacy, and fertility-related services.
  • Retirement Support: The firm provides a 401(k) plan with employer matching contributions and broad financial education to help employees plan for retirement. Resources also support saving for education and preparing for unexpected events.

The Company
HQ: New York, NY
67,118 Employees

What We Do

At Goldman Sachs, we believe progress is everyone’s business. That’s why we commit our people, capital and ideas to help our clients, shareholders and the communities we serve to grow. Founded in 1869, Goldman Sachs is a leading global investment banking, securities and investment management firm. Headquartered in New York, we maintain offices in all major financial centers around the world. More about our company can be found at www.goldmansachs.com
