Pipl is an AI company that helps global enterprises make better fraud decisions. We're built on more than 20 years of identity data and Elephant, the industry's only large risk model trained on payment fraud.
Pipl's three products give enterprise teams modular access to the intelligence they need across payment and transaction ecosystems, where the cost of getting it wrong is highest.
Pipl Trust brings AI-native risk decisioning into payment workflows, with Elephant resolving identity across behavioral, device, and network signals in real time. Pipl Search connects identity intelligence for investigations and background checks, drawing on our global identity graph. Pipl Elements delivers phone and email signals that strengthen existing fraud models and verification workflows.
Our identity graph covers more than 5 billion identities and 740 billion signals. The world's largest payment networks, ecommerce marketplaces, and digital wallet platforms trust Pipl to get it right.
We’re looking for a Data Engineer to join our growing data team and help design, build, and scale our data infrastructure. Our team works closely with product, engineering, and data science to ensure reliable, high-quality data pipelines that power analytics, machine learning models, and data-driven decision-making.
As a Data Engineer, you will be responsible for creating and maintaining systems that collect, process, and store vast amounts of data, ensuring it is accessible, reliable, and optimized for performance across the organization.
Responsibilities
- Design, build, and maintain scalable ETL pipelines from multiple sources.
- Work closely with product managers, data scientists, and analysts to ensure data solutions meet business and technical needs.
- Ensure data integrity, accuracy, and security across platforms, implementing monitoring and validation frameworks.
- Improve data pipeline efficiency and performance, ensuring low latency and cost-effective solutions.
- Recommend and implement new technologies, tools, and best practices for data engineering.
- Leverage AI tools and assistants to improve productivity, code quality, and pipeline development, and share best practices with the team.
Requirements
- 3+ years of experience as a Data Engineer (or in a similar role).
- Passionate about AI and actively uses AI tools to accelerate day-to-day work, from writing and debugging code to data exploration, documentation, and problem-solving.
- Strong programming skills in Python.
- Hands-on experience with cloud environments.
- Experience working with dbt for data transformations and modeling (an advantage).
- Solid experience with SQL and database design (both relational and NoSQL).
- Proven track record in building and maintaining large-scale data pipelines using frameworks such as Spark, Airflow, Kafka, or similar.
- Strong understanding of data modeling, warehousing, and ETL best practices.
- Self-motivated, detail-oriented, and able to work autonomously.
- Excellent communication skills in English.
Advantages
- Experience with containerization and orchestration (Docker, Kubernetes).
- Programming skills in Java and Scala.
- Familiarity with real-time data processing systems.
- Exposure to data security and compliance best practices.
- Prior experience working in a big data or search engine environment.
- Hands-on experience with Aerospike.
What We Do
Pipl is the identity trust company that makes sure no one pretends to be you. We do this by understanding the deep connections between the data elements that make up an identity and looking at the big picture. We analyze the relationships among many identifiers from around the globe, such as email, mobile phone, and social media data. Our identity resolution engine continuously collects, cross-references, and connects identity records to create data clusters across the internet and numerous exclusive sources. The result is a searchable index of more than 3.5 billion identity profiles comprising over 3.6 billion phone numbers and 1.7 billion email addresses, with coverage in more than 150 countries. Our API and manual review solutions allow merchants to provide frictionless customer experiences and approve more transactions while reducing chargebacks and the risk of fraud.







