As a senior member of the ML data platform team, you will be responsible for architecting, developing, and operating the next generation of the Moveworks Machine Learning data platform. As Moveworks grows rapidly, the ML data platform team is tasked with designing and developing scalable, reliable, and secure data platforms, processing pipelines, and services that power Moveworks’ cutting-edge NLP and Conversational AI technologies with first-class enterprise data governance, security, and privacy standards.
We’re building a team that indexes on moving fast, solving challenging product and engineering problems, and providing value to our customers. To be successful, you'll partner with high-caliber machine learning teams to identify their data needs and build elegant solutions. This is an opportunity to play an integral role at the fastest-growing AI startup in its space.
Who we are:
Moveworks is revolutionizing how companies support their employees — with the first AI platform that makes getting help at work effortless. Using advanced conversational AI built for the enterprise, Moveworks gives employees exactly what they need, from IT support to HR help to policy information. Our platform allows customers like DocuSign, Broadcom, and Western Digital to move forward on what matters.
Founded in 2016, Moveworks has raised $315 million in funding, at a valuation of $2.1 billion. We’ve been named to the Forbes AI 50 list for three consecutive years, while earning recognition as the Best Chatbot Solution at the 2021 AI Breakthrough Awards. Above all, we’ve built an AI company that puts people first, which is why both Inc. and the San Francisco Business Times called Moveworks one of the Best Workplaces of 2021.
Come join one of the fastest-growing teams on the planet!
What will you do?
- Work closely with machine learning teams to understand their data needs, influence the data team’s roadmap, and lead and execute projects
- Design, build, and operate highly performant and scalable batch and stream data processing infrastructure and solutions to support day-to-day ML operations, including training, serving, evaluation, and experimental systems
- Design and develop Moveworks’ foundational data models, data warehouse, and real-time and offline processing pipelines using Apache Kafka, AWS EMR Spark, AWS Athena, Snowflake, Airflow, and similar tools
- Build a data lake and implement a data cataloging platform for easy data discovery and availability
- Architect and implement data anonymization and data access control frameworks that support policy-based masking of and access to data for different use cases
- Build out platform and data services/APIs to make data available to various stakeholders and for customer-facing data products
What do you bring to the table?
- 4+ years of experience as a data platform engineer
- Strong coding and design expertise
- Hands-on experience working with different teams on their data requirements, designing data models, and implementing ETL pipelines
- Familiarity with the latest data streaming, query, processing, and warehousing technologies, such as Kafka, Presto, Airflow, Spark, Hive, Postgres, and HBase, is required
- BS or higher in Computer Science or a related field, or equivalent relevant experience
- Experience with large-scale machine learning systems is a plus
- Experience with designing and building data validation and anonymization frameworks is a plus