Data Engineer
Rapid7 (Nasdaq: RPD) is advancing security with visibility, analytics, and automation delivered through our Insight cloud. Our solutions simplify the complex, allowing security teams to work more effectively with IT and development to reduce vulnerabilities, monitor for malicious behavior, investigate and shut down attacks, and automate routine tasks. Over 9,300 customers rely on Rapid7 technology, services, and research to improve security outcomes and securely advance their organization. For more information, visit our website, check out our blog, or follow us on LinkedIn.
The Opportunity
Rapid7 seeks a Data Engineer to build and maintain data infrastructure on the Data Engineering team's data platform. You will be responsible for deploying data pipelines and machine learning models in the cloud, implementing DevOps practices, and developing data models within Snowflake. You will help architect a modern data platform, working closely with cloud-based tools such as Fivetran, Snowflake, and Airflow. The ability to conceptualize and create user-friendly self-service solutions is critical to success in this role. Our business is evolving quickly, and we need you to think long term but deliver incrementally.
The ideal candidate has hands-on experience performing Data Engineering and/or DevOps work in a cloud environment and has worked closely with databases and data pipelines. It's critical that you are able to translate business objectives into the data required to support key analyses. You will collaborate with a creative, analytical, and data-driven team to bring a single source of truth and self-service analytics to the entire company.
In the role you will:
- Build and maintain the pipelines and applications that ingest, analyze, and store Rapid7's enterprise data
- Lead the entire software lifecycle for batch ETLs, including hands-on development, code reviews, testing, deployment, and documentation
- Productionize data and machine learning pipelines with Docker containerization and clustering tools (ECS/Kubernetes)
- Build an environment that enables data scientists to easily develop and productionize Python, R, and Spark code on top of a Snowflake data warehouse
- Perform data engineering projects within Snowflake, such as developing data pipelines, data models, and metadata management solutions
- Collaborate with stakeholders in product, business, and IT to deliver data products
- Work closely with leadership to drive adoption of the latest DevOps and DataOps trends and technologies
- Partner with the IT, Infrastructure, and Engineering teams on integration efforts between systems that impact Data & Analytics
You will bring:
- 1+ years of experience with a major cloud provider (preferably AWS), including hands-on experience deploying code in cloud environments using tools such as Docker, Kubernetes, EC2, and Terraform
- 1+ years of experience working with a modern cloud data warehouse (preferably Snowflake) and SQL
- 1+ years of experience with orchestration tooling (preferably Airflow)
- Proficiency in at least one programming language, such as Python, Java, or Scala
- Experience with a CI/CD tool such as GitHub Actions or AWS CodePipeline
- Working knowledge of data architecture, data warehousing, and metadata management
- BS or MS in Computer Science, Analytics, Statistics, Informatics, Information Systems, or another quantitative field; equivalent experience and certifications will also be considered