We are seeking a highly skilled Senior Data Engineer to design, build, and optimize scalable data pipelines, storage layers, and analytics infrastructure. The ideal candidate has strong data engineering foundations, deep experience with distributed data systems, and the ability to turn complex business requirements into high-performance, production-grade data solutions. The role also requires leadership in driving complex data projects end-to-end.
Key Responsibilities:
Leadership & Project Management
- Lead end-to-end technical delivery of data engineering projects.
- Define technical scope, timelines, and engineering milestones.
- Manage and coordinate multiple stakeholders (data scientists, ML engineers, backend teams, and product teams).
- Conduct architectural reviews, ensure engineering best practices, and drive technical decision-making.
- Mentor junior and mid-level engineers, fostering a culture of learning and excellence.
- Oversee code quality, documentation, and adherence to team standards.
- Proactively identify risks, dependencies, and areas for improvement across projects.
Data Engineering & Systems Development
- Design and implement scalable ETL/ELT pipelines for both batch and streaming workloads, and build/maintain data warehouses, lakes, and lakehouse architectures.
- Develop optimized data models (relational, dimensional, NoSQL) while ensuring data quality, lineage, governance, and metadata management.
- Write clean, modular, and well-tested production code in Python, Java, or Scala; build APIs, microservices, and integrations for data workflows; and implement CI/CD automation and testing.
- Build secure, scalable cloud-based data infrastructure on AWS/GCP/Azure using containerized, serverless, and distributed technologies, and manage IaC using Terraform, CloudFormation, etc.
- Work with modern data and streaming frameworks (Spark, Flink, Beam, Kafka, Kinesis, Pub/Sub) and cloud data platforms (Snowflake, BigQuery, Redshift, Databricks) to deliver real-time and event-driven pipelines.
- Collaborate with cross-functional teams (Data Scientists, ML Engineers, Software Engineers) and lead design discussions, code reviews, and mentoring to maintain engineering best practices.
Requirements:
- 3+ years of experience in Data Engineering roles.
- Strong proficiency in Python, SQL, and at least one compiled language (Java/Scala/Go).
- Hands-on experience with data pipeline orchestration (Airflow, Dagster, Prefect, etc.).
- Deep understanding of distributed systems, data partitioning, fault tolerance, and performance tuning.
- Experience with containerized environments (Docker, Kubernetes).
- Strong problem-solving skills, debugging abilities, and systems-level thinking.
Preferred Qualifications:
- Experience supporting MLOps workflows and feature/data pipelines for machine learning models and AI applications.
- Knowledge of Delta Lake and modern data warehouse architectures.
- Familiarity with security, compliance, access controls, and enterprise data governance.
- Exposure to data observability tools (Monte Carlo, Databand, Soda, etc.).
- Background in AI, analytics, or large-scale product engineering is a plus.
What We Do
We integrate global leaders in web development with passionate talent across Asia to deliver a unique blend of quality and affordability.
We are headquartered in California and keep consistent Eastern and Pacific time-zone hours.
We like ad hoc pairing as necessary, TDD, and working with other agencies to make things happen.
We contribute to open-source projects and genuinely enjoy coding. We are also committed to teaching and spreading knowledge!