Sr. Data Engineer - Machine Learning & Innovation team
Data Engineers on the Disney Streaming Machine Learning and Innovation (MLI) team develop and maintain the systems and datasets used for content recommendation and personalization on Disney Streaming’s suite of streaming video apps, notably Disney+. In this role, you will partner with MLI’s Data Scientists and Machine Learning Engineers to help manage and scale the processes that create algorithm input features and datasets. As a member of this team, you will collaborate across Engineering and Data teams to identify internal and external data sources and to design and implement ETL strategies, automation frameworks, and scalable data pipelines.
Responsibilities:
- Partner with technical and non-technical colleagues to understand algorithm feature and data requirements
- Work with Engineering teams to collect required data from internal and external systems
- Develop and maintain ETL routines using orchestration tools such as Airflow and Jenkins
- Collaborate with machine learning practitioners to design and build data-forward solutions
- Deploy scalable streaming and batch data pipelines to support petabyte-scale datasets
- Enforce common data design patterns to increase code maintainability
- Create ETL architecture designs and conduct reviews
- Perform ad hoc data analysis as necessary
- Partner with team leads to identify, design, and implement internal process improvements
- Drive and maintain a culture of quality, innovation, and experimentation
- Work in an Agile environment that focuses on collaboration and teamwork
Basic Qualifications:
- 5+ years of software experience, including 3+ years of relevant data and software experience
- Experience in building large datasets and scalable services
- Experience deploying and running services in AWS and in engineering big-data solutions using technologies like Databricks, EMR, S3, and Spark
- Experience loading and querying cloud-hosted databases such as Redshift and Snowflake
- Experience designing and developing backend microservices for large-scale distributed systems using gRPC or REST
- Experience with large-scale distributed data processing systems, cloud infrastructure such as AWS or GCP, and container systems such as Docker or Kubernetes
Preferred Qualifications:
- Knowledge of the Python/Scala/Java data ecosystem
- Experience building streaming pipelines using Kafka, Spark, Flink, or Samza
- Excellent communication and people engagement skills
- Ability to drive and maintain a culture of quality, innovation, and experimentation
- Ability to mentor colleagues on best practices and technical concepts of building large-scale solutions
Required Education:
- Bachelor’s degree in Computer Science or a related field, or equivalent work experience