KEY RESPONSIBILITIES:
- Design and Develop Data Pipelines:
  - Architect, build, and deploy scalable and efficient data pipelines within our Big Data ecosystem using Apache Spark and Apache Airflow.
  - Document new and existing pipelines and datasets to ensure clarity and maintainability.
- Data Architecture and Management:
  - Demonstrate familiarity with data pipelines, data lakes, and modern data warehousing practices, including virtual data warehouses and push-down analytics.
  - Design and implement distributed data processing solutions using technologies such as Apache Spark and Hadoop.
- Programming and Scripting:
  - Exhibit expert-level programming skills in Python, with the ability to write clean, efficient, and maintainable code.
- Cloud Infrastructure:
  - Utilize cloud-based infrastructures (AWS/GCP) and their various services, including compute resources, databases, and data warehouses.
  - Manage and optimize cloud-based data infrastructure, ensuring efficient data storage and retrieval.
- Workflow Orchestration:
  - Develop and manage workflows using Apache Airflow for scheduling and orchestrating data processing jobs.
  - Create and maintain Apache Airflow DAGs for workflow orchestration (a minimal illustrative sketch follows this list).
- Big Data Architecture:
  - Possess strong knowledge of Big Data architecture, including cluster installation, configuration, monitoring, security, resource management, maintenance, and performance tuning.
- Innovation and Optimization:
  - Create detailed designs and proofs of concept (POCs) to enable new workloads and technical capabilities on the platform.
  - Collaborate with platform and infrastructure engineers to implement these capabilities in production.
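To give candidates a concrete sense of this work, below is a minimal, illustrative sketch of an Airflow DAG that schedules a single Spark job. The DAG id, owner, file path, and connection id are hypothetical placeholders, not our production configuration.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    # Hypothetical example: a daily DAG that submits one Spark batch job.
    default_args = {
        "owner": "data-engineering",  # placeholder team name
        "retries": 2,
        "retry_delay": timedelta(minutes=10),
    }

    with DAG(
        dag_id="example_daily_ingest",  # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args=default_args,
    ) as dag:
        # Submit a PySpark application; the script path and Spark
        # connection are assumptions for illustration only.
        ingest = SparkSubmitOperator(
            task_id="spark_ingest",
            application="/opt/jobs/ingest.py",  # placeholder path
            conn_id="spark_default",
        )

In practice, production DAGs in this role chain many such tasks with dependencies, alerting, and backfill policies.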
KEY REQUIREMENTS:
- Minimum of 8 years of hands-on experience with Big Data technologies, e.g., Hadoop, Spark, and Hive.
- Minimum of 3 years of experience with Spark.
- Hands-on experience with Google Cloud Dataproc is a huge plus (a short illustrative PySpark sketch follows this list).
- Minimum of 6 years of experience in cloud environments, preferably GCP.
- Any experience with NoSQL and graph databases is a plus.
- Hands-on experience managing solutions deployed in the cloud, preferably on AWS.
- Experience working in a global company; experience working in a DevOps model is a plus.
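As a rough gauge of the Spark fluency expected, here is a small, hypothetical PySpark sketch of a typical batch aggregation. The bucket paths and column names are invented for illustration; on GCP, a job like this would commonly run on Dataproc against data in Cloud Storage.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical example: aggregate daily event counts from a data lake.
    spark = SparkSession.builder.appName("example_daily_counts").getOrCreate()

    # Placeholder input path; on Dataproc this would typically be a GCS bucket.
    events = spark.read.parquet("gs://example-bucket/events/")

    daily_counts = (
        events
        .withColumn("event_date", F.to_date("event_ts"))  # assumed timestamp column
        .groupBy("event_date", "event_type")              # assumed columns
        .count()
    )

    # Placeholder output path.
    daily_counts.write.mode("overwrite").parquet("gs://example-bucket/daily_counts/")

A job like this would typically be submitted on a schedule by an orchestrator such as the Airflow DAG sketched earlier.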
What We Do
For almost 200 years, Dun & Bradstreet has helped clients and partners grow and thrive through the power of data, analytics, and data-driven solutions. Our more than 6,000 employees around the world are dedicated to this unique purpose, and we are guided by important values that make us the established leader in business decisioning data and analytical insights. Our data and insights are valuable at all phases of a business lifecycle, whatever the economic environment.
Why Work With Us
We are at a transformational moment in our company journey, and we’re excited about it. Each day, we are taking steps to transform our culture into one that activates our people’s best work, exploring what needs to change to accelerate creativity and innovation, and challenging ourselves to think differently about how we interact.
