The Role
Role Summary: We are seeking a highly skilled Senior Data Engineer to join our team. This role requires deep technical expertise in real-time data streaming, distributed computing, and big data technologies. The successful candidate will have a proven track record of designing and implementing scalable, high-performance data pipelines. As a Senior Data Engineer, you will guide the technical direction of our data infrastructure, mentor our data teams, and ensure robust and scalable data solutions.
Responsibilities:
- Design and implement complex data pipelines for both batch and real-time processing.
- Lead the architectural design of streaming data platforms, ensuring scalability, performance, and data reliability.
- Collaborate with data scientists, analysts, and business stakeholders to gather requirements and translate them into technical specifications.
- Develop high-quality, maintainable data processing solutions using modern streaming technologies.
- Oversee the development and maintenance of data lakes, data warehouses, and streaming platforms.
- Mentor junior data engineers, fostering a culture of continuous learning and improvement.
- Conduct code reviews to ensure adherence to best practices and data engineering standards.
- Stay abreast of emerging big data technologies and industry trends.
- Optimize data pipelines for maximum throughput and minimal latency.
- Ensure data quality, consistency, and reliability across all data platforms.
- Troubleshoot and debug complex issues in distributed systems.
Must-have Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 10+ years of experience in data engineering.
- Strong expertise in Apache Spark and Spark Streaming for large-scale data processing.
- Extensive experience with Apache Kafka for real-time data streaming and event processing.
- Proficiency in building and maintaining real-time analytics platforms, particularly with Apache Druid.
- Strong programming skills in Python, Scala, or Java.
- Deep understanding of distributed systems and big data architectures.
- Extensive experience with both batch and stream processing paradigms.
- Strong knowledge of data modeling and optimization techniques.
- Experience with major cloud platforms (AWS, Azure, GCP) and their data services.
- Excellent problem-solving skills and meticulous attention to detail.
- Strong communication and collaboration skills.
Nice-to-have Qualifications:
- Master's degree in Computer Science, Information Technology, or a related field.
- Experience with additional streaming technologies like Apache Flink and Apache Storm.
- Knowledge of data governance and data security best practices.
- Experience deploying real-time machine learning models.
- Familiarity with modern data lake technologies like Delta Lake and Apache Iceberg.
- Experience with NoSQL databases like Cassandra and MongoDB.
- Proficiency in data visualization tools such as Grafana and Kibana.
- Experience with infrastructure-as-code and CI/CD for data pipelines.
- Certifications in relevant cloud platforms or big data technologies.
Top Skills
Java
Python
Scala
The Company
What We Do
Empowering the world with timely, trusted, and actionable data through enhanced optical spectroscopy.