A Typical Day:
- Troubleshoot and resolve complex issues raised by open-source users and Onehouse customers, performing deep code-level analysis of Apache Hudi with Spark and Flink.
- Provide best-practice guidance on Apache Hudi runtime performance and on the use of Apache Hudi core libraries and APIs for custom-built solutions.
- Help the Engineering/Product team with detailed troubleshooting guides and runbooks.
- Contribute to automation and tooling programs to make daily troubleshooting efficient.
- Work with data engineers and the engineering team to spread awareness of upcoming features and releases.
- Identify Apache Hudi bugs and contribute possible workarounds.
- Evangelize Apache Hudi and expand its ecosystem through open-source contributions to Hudi integrations with other open-source projects in this space.
- Demonstrate ownership and coordinate with engineering and escalation teams to resolve customer issues and requests.
- Participate in a weekly on-call rotation to help the open-source community.
- Triage, escalate, and resolve issues with a sense of urgency, delivering advanced technical and operational support for open-source users and customers.
What You Bring to the Table:
- 3+ years of experience working directly with customers on technical engagements.
- 3+ years of experience with major cloud providers such as AWS, GCP, or Azure.
- Strong verbal and written communication skills with a track record of simplifying technical explanations.
- 3+ years of deep experience with Apache Spark, including knowledge of Spark’s runtime internals, query tuning, performance tuning, and troubleshooting and debugging Spark solutions.
- Hands-on technical experience building, operating, and troubleshooting data engineering pipelines.
- Working knowledge of several modern data tools and frameworks, such as Apache Spark, Kafka, Flink, Presto/Trino, Hive, DBT, Airflow, Parquet, Avro, and ORC.
- Familiarity with cloud data services such as EMR, Redshift, Kinesis, Glue, Dataproc, BigQuery, Databricks, HDInsight, and Synapse.
- Ability to collaborate with multiple stakeholders internally and externally.
- Experience with modern data lakehouse technologies such as Apache Hudi, Iceberg, or Delta Lake.
- Experience contributing to an open-source project.
What We Do
Onehouse delivers a universal data lakehouse through a cloud-native managed lakehouse service built on Apache Hudi, which was created by the founding team while they were at Uber. Onehouse blends the ease of use of a warehouse with the scale of a data lake, giving engineers a seamless experience for getting their data lakes up and running. Onehouse offers the widest interoperability in the market for your data, across table formats, compute engines, and cloud providers.
We have a stellar team of inspired, seasoned professionals, including data, distributed systems, and platform engineers from Uber, LinkedIn, Confluent, and Amazon. Our product team has helped build enterprise data products, including Azure Databricks.