Principal Software Engineer - Data Platform

Posted 8 Days Ago
Belfast, County Antrim, Northern Ireland
Hybrid
7+ Years Experience
Artificial Intelligence • Cloud • Information Technology • Sales • Security • Software • Cybersecurity
At Rapid7, we are on a mission to create a secure digital world for our customers, our industry, and our communities.
The Role
Principal Software Engineer role in the Data Platform team responsible for architecting and scaling data pipelines, designing CI/CD infrastructure, and collaborating with cross-functional teams to deliver high-performance data processing solutions.

As a Principal Engineer, you'll be a hands-on engineer, learning best-practice engineering processes and approaches whilst receiving ongoing development through coaching, mentoring and pairing with other engineers on your team. From problem-solving to challenging old ways of thinking, you will have the opportunity to unleash your full potential and creativity whilst working with cutting-edge technologies in a dynamic and collaborative team.
About the Team
The Data Platform team is responsible for building the ETL pipelines that fuel the Data Platform at Rapid7, moving product data into the platform so that product teams can develop new features, enhance existing ones, and build shared experiences that create value for customers across the world.
We have a cutting-edge data stack including Kafka, Kubernetes, Spark and Iceberg.
About the Role
The Principal Engineer role is part of our Data Platform Engineering team. In this role you will be focused on helping our product teams move data into our Data Platform for in-product experiences and product analytics.
As a Principal Engineer on the Data Platform Engineering team, you will be responsible for architecting and scaling streaming and batch data pipelines, while also designing the CI/CD infrastructure that ensures efficient development and deployment of data services. You will play a key role in shaping the architecture of our data platform, collaborating with cross-functional teams to deliver highly available, performant, and scalable solutions for both real-time and large-scale data processing.
In this role, you will:

  • Architect and implement a highly scalable Data Platform that supports Change Data Capture (CDC) using Debezium and Kafka for data replication across different databases and services.
  • Design and maintain large-scale data lakes using Apache Iceberg, ensuring efficient data partitioning, versioning, and schema evolution to support real-time analytics and historical data access.
  • Build and optimize CI/CD pipelines for the deployment and automation of data platform services using tools like Jenkins.
  • Lead the integration of Apache Spark for large-scale data processing and ensure that both batch and streaming workloads are handled efficiently.
  • Collaborate with our Platform Delivery teams to ensure high availability and performance of the data platform, implementing monitoring, disaster recovery, and automated testing frameworks.
  • Provide technical leadership and mentoring to junior engineers, promoting best practices in CDC architecture, distributed systems, and CI/CD automation.
  • Ensure that the platform adheres to data governance principles, including data lineage tracking, auditing, and compliance with regulatory requirements.
  • Stay informed about the latest advancements in CDC, data engineering, and infrastructure automation to guide future platform improvements.
  • Work closely with product and data science teams to understand business requirements and translate them into scalable and efficient data platform solutions.
  • Stay current with the latest trends in data engineering and infrastructure, making recommendations for improvements and introducing new technologies as appropriate.


The skills you'll bring include:

  • 10+ years of experience in software engineering with a focus on data platform engineering, data infrastructure, or distributed systems.
  • Expertise in building data pipelines using Apache Kafka or similar for ingesting, processing, and distributing high-throughput data.
  • Strong experience designing and managing CI/CD pipelines for data platform services using tools such as Jenkins.
  • Experience with Apache Iceberg (or a similar table format such as Delta Lake or Apache Hudi) for managing versioned, partitioned datasets in data lakes, along with an understanding of Apache Spark for both batch and streaming data processing, including optimization strategies for distributed data workloads.
  • Expertise in designing distributed systems and managing high-throughput, fault-tolerant, and low-latency data architectures.
  • Strong programming skills in Java, Scala, or Python.
  • Experience with cloud-based environments (AWS, GCP, Azure) and containerized infrastructure using Kubernetes and Docker.
  • The attitude and ability to thrive in a high-growth, evolving environment.
  • Collaborative team player with the ability to partner with others and drive toward solutions.
  • Strong creative problem-solving skills.
  • Solid communicator with excellent written and verbal communication skills, both within the team and cross-functionally.
  • Passionate about delighting customers; puts customer needs at the forefront of all decision-making.
  • Excellent attention to detail.


We know that the best ideas and solutions come from multi-dimensional teams. That's because these teams reflect a variety of backgrounds and professional experiences. If you are excited about this role and feel your experience can make an impact, please don't be shy - apply today.

Top Skills

Apache Iceberg
Debezium
Jenkins
Kafka
Kubernetes (K8s)
Spark

The Company
HQ: Boston, MA
2,400 Employees
Hybrid Workplace
Year Founded: 2000

What We Do

At Rapid7, we are on a mission to create a secure digital world for our customers, our industry, and our communities. We do this by embracing tenacity, passion, and collaboration to challenge what’s possible and drive extraordinary impact.

Here, we’re building a dynamic workplace where everyone can have the career experience of a lifetime. We challenge ourselves to grow to our full potential. We learn from our missteps and celebrate our victories. We come to work every day to push boundaries in cybersecurity and keep our 11,000+ global customers ahead of whatever’s next.

Why Work With Us

What makes us unique is how we embrace, model, and celebrate our core values. By challenging convention, being an advocate, creating impact together, always bringing our full selves, and recognizing that our work is never done, we are able to make an extraordinary impact on our business, our industry, and our own career growth.


Rapid7 Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Our default working model is hybrid, with employees working three days per week in the office. This approach underpins our commitment to flexibility and adaptability while supporting our dedication to development, teamwork and customer purpose.

Typical time on-site: 3 days a week
HQ: Boston
Arlington
Austin, TX
Belfast, GB
Prague
Reading, UK
Town 'n' Country, FL