The Data Engineering team builds and operates the analytical data platform that powers machine learning, data science, analytics, and reporting across Veriff. We are responsible for large-scale data ingestion, platform reliability, and enterprise data governance — ensuring Veriffians have access to accurate and timely data.
In this role, you will own and evolve our data lake and data warehouse infrastructure, driving platform-level data management, governance, and reliability at scale.
You'll help us protect honest people online by:
- Owning and evolving our data lake and data warehouse infrastructure using technologies such as Spark, Apache Iceberg, S3, Trino/Athena, and Redshift.
- Designing and maintaining platform-level data transformation pipelines in Python and SQL — focused on schema evolution, partitioning, compaction, and deduplication.
- Implementing optimized storage formats (Parquet, Avro, ORC), partitioning strategies, and indexing to improve query performance and reduce platform costs.
- Driving data governance initiatives — PII detection and classification, access control policies, data cataloging, lineage tracking, and data quality frameworks.
- Ensuring the availability, reliability, and cost efficiency of the data platform, including observability, monitoring, and alerting for pipeline and query engine health.
- Collaborating with ML, analytics, product, and engineering teams to define data contracts, maintain schema consistency, and provide clean, well-governed datasets.
- Contributing to disaster recovery strategy and multi-region reliability of the data platform.
You are the right future Veriffian for the job if you have:
- Strong experience with Python, SQL, and Apache Spark / PySpark for large-scale data processing.
- Deep knowledge of modern analytics platform architecture — object stores, columnar and row-based data formats (Parquet, Avro, ORC), orchestration tools, analytical query engines, schema registries, and data catalogs.
- Experience with data governance and data management at scale — PII handling, data cataloging, schema management, access control, and data quality frameworks.
- Experience designing and operating data lake and data warehouse infrastructure.
- Solid understanding of storage optimization — partitioning, compaction, and compression trade-offs.
- Experience building observability, monitoring, and alerting for data platforms.
- Strong problem-solving skills and comfort working with ambiguity — defining problems before solving them.
- A collaborative mindset — this role serves ML, analytics, and product teams as internal customers.
You're an especially awesome match if you have:
- Experience with Infrastructure as Code (IaC) and Terraform.
- Familiarity with containerization — Docker and Kubernetes.
- Experience with CI/CD pipelines for data platform deployments.
- Knowledge of data lake table formats beyond Iceberg (Delta Lake, Hudi).
- Familiarity with data catalog and metadata management tools (e.g., DataHub, Amundsen, AWS Glue Catalog).
- Understanding of data privacy regulations (GDPR) in the context of data engineering.
- Experience building streaming data pipelines.
- Experience with the AWS data stack.
What we offer:
- Flexibility to work from home
- Stock options that ensure your share in our success
- Extra recharge days on top of your annual vacation
- Comprehensive relocation support to Estonia or Spain
- Extensive medical, dental, and vision insurance to ensure you’re feeling great physically and mentally
- A Learning and Development budget and a Health and Sports budget that you are free to tailor to your own needs
- Four weeks of fully paid sabbatical leave after reaching your 5th work anniversary
What We Do
Veriff is the preferred identity verification partner for the world’s biggest and best digital companies, including pioneers in the fintech, crypto, gaming, and mobility sectors. We provide advanced technology, deep insights, and expertise rooted in our foundation in digital-first Estonia and honed over decades in leading the digital identity revolution.

The partner of choice for businesses that need to rapidly and effortlessly verify online users from anywhere in the world, Veriff delivers the widest possible identity document coverage. By supporting government-issued IDs from more than 230 countries and territories, and with an intelligent decision engine that analyzes thousands of technological and behavioral variables, Veriff enables trust from the first hello. With offices in the United States, United Kingdom, Spain, and Estonia, and robust backing and funding from investors including Accel, Alkeon, IVP, Tiger Capital, and Y Combinator, we’re dedicated to helping businesses and individuals build a safer and more secure world. To learn more, visit veriff.com.







