Senior Data Engineer

Hiring Remotely in Warsaw, Warszawa, Masovian, POL
In-Office or Remote
Senior level
Software
The Role
Lead design and delivery of cloud-native, AI-ready data platforms and near-real-time ingestion pipelines. Optimize Spark/PySpark code, define platform standards (lakehouse, medallion), enforce data quality/observability, drive CI/CD and IaC adoption, and enable self-service tooling while collaborating with ML, data science, and product teams.
Company Description

Are you passionate about building cutting-edge, AI-ready data platforms from the ground up? We are looking for a Senior Data Engineer to join our Data Engineering Team and lead high-impact, greenfield initiatives.

You will work on building modern cloud-native data platforms, migrating on-premises legacy systems to the cloud, and laying the architectural foundation for AI-ready data infrastructure. 

In this role, you will collaborate closely with Machine Learning, Data Science, and Product teams, serving as a key technical contributor and thought leader. You will also drive R&D efforts around agentic AI architectures, event-driven systems, and LLM-ready data pipelines – turning architectural concepts into production-grade solutions.

Job Description

  • Design and build scalable, cloud-native data platforms from greenfield to production
  • Implement near-real-time ingestion pipelines using event-driven patterns
  • Define and enforce platform standards, including Data Lake / Lakehouse principles, medallion architecture, and data contracts
  • Refactor and optimise existing Spark and PySpark scripts for performance and maintainability
  • Introduce best practices for code quality, testing, and CI/CD across data pipelines
  • Drive adoption of AI tooling and agentic workflows within the data engineering team
  • Ensure data quality, observability, and reliability across all pipelines and platforms
  • Develop self-service tooling and microservices to simplify platform usage for other teams

Qualifications

  • 5+ years of professional experience in Data Engineering
  • Strong Python and SQL development skills for pipeline development and optimisation
  • Proficiency in Apache Spark / PySpark, including query optimisation and performance tuning
  • Hands-on experience with Databricks (preferred) or Snowflake
  • Experience with at least one major cloud provider: Azure (preferred), AWS, or GCP
  • Experience with stream processing technologies (Kafka, Spark Structured Streaming)
  • Solid understanding of ETL/ELT patterns, data modelling (dimensional, Data Vault), and data warehousing
  • Experience with orchestration tools (Apache Airflow, Azure Data Factory, or equivalent)
  • Knowledge of Infrastructure as Code (Terraform or equivalent)
  • Understanding of production-grade system requirements: reliability, scalability, observability, and performance
  • Upper-Intermediate English level

NICE TO HAVE

  • Familiarity with RAG pipeline design and LLM integration patterns
  • Knowledge of data governance frameworks and tools (Unity Catalog, Apache Atlas, or similar)
  • Experience with dbt for data transformation and modelling
  • Familiarity with MLflow, Feature Stores, or ML platform integration

Additional Information

PERSONAL PROFILE

  • Self-driven and proactive in identifying improvements
  • Comfortable working in a fast-paced, innovative environment
  • Strong problem-solving mindset with attention to detail
  • Open to experimenting with emerging technologies and approaches

Skills Required

  • 5+ years of professional experience in Data Engineering
  • Strong Python development skills
  • Strong SQL development skills
  • Proficiency in Apache Spark / PySpark, including query optimisation and performance tuning
  • Hands-on experience with Databricks or Snowflake
  • Experience with at least one major cloud provider: Azure (preferred), AWS, or GCP
  • Experience with stream processing technologies (Kafka, Spark Structured Streaming)
  • Solid understanding of ETL/ELT patterns, data modelling (dimensional, Data Vault), and data warehousing
  • Experience with orchestration tools (Apache Airflow, Azure Data Factory, or equivalent)
  • Knowledge of Infrastructure as Code (Terraform or equivalent)
  • Experience with CI/CD for data pipelines and testing best practices
  • Upper-Intermediate English level
  • Familiarity with RAG pipeline design and LLM integration patterns
  • Knowledge of data governance frameworks and tools (Unity Catalog, Apache Atlas)
  • Experience with dbt for data transformation and modelling
  • Familiarity with MLflow, Feature Stores, or ML platform integration

The Company
New York, New York
1,516 Employees

What We Do

Sigma Software Group, an award-winning and trusted IT partner, has been serving customers for over 21 years, providing comprehensive IT solutions to businesses ranging from startups to established software product houses. As one of Europe's substantial IT consultancies, it brings together a dedicated workforce of over 2,100 professionals in 40 offices across 19 countries. With a diverse client base of more than 300 enterprises, including Fortune 500 stalwarts, Sigma Software Group is a preferred choice for developing solutions that help businesses create cutting-edge products while meeting their unique needs. Sigma Software Group operates as a dynamic ecosystem of tech companies, offering 25 ready-to-implement innovative products and 40+ value-added services. Furthermore, Sigma Software Group is committed to fostering innovation through initiatives such as the Sigma Software Labs business incubator, Sigma Software University, the SID Venture Partners VC Fund, UA Tech Network, Techosystem, the European Business Association, and other collaborative efforts. Since 2015, Sigma Software Group has consistently earned recognition on the IAOP's prestigious World's Top 100 Outsourcing list. The company's accomplishments have also been acknowledged by prominent global media outlets such as Forbes, CNBC, The Times, and Reuters.
