Lead Data Engineer

Posted 2 Days Ago
Hiring Remotely in United States
Remote
190K-230K Annually
Senior level
Security • Software
The Role
The Lead Data Engineer will design and implement cloud data lakes and large-scale data architectures. Responsibilities include collaboration with AI/ML teams, establishing data governance, and mentoring junior engineers. This role requires expertise in big data technologies, tabular data formats such as Apache Iceberg and Parquet, and proficiency in cloud platforms like AWS, Azure, or GCP.
Summary Generated by Built In

StrongDM is driven by a clear mission: Secure Access, Zero Trust.


We design products and solutions that reflect this commitment, transforming the way organizations manage privileged access across their critical infrastructure. By leading with Zero Trust Privileged Access Management (PAM), we help our customers achieve secure, dynamic, and fine-grained control over access to their most sensitive resources. This focus on security has earned us an industry-leading 98% customer retention rate.


Once a customer, forever a fan. That's our goal.


When you work at StrongDM, you join a team committed to solving today’s security challenges with technology that works and customers who trust us to protect their most critical assets.


If you ask anyone at StrongDM, you’ll find that our values truly guide everything we do—from how we innovate to how we treat each other. These values are the foundation of our culture and define who we are as a company. It may sound cliché, but we’re onto something great—and G2 agrees. 


We embrace the mission

We pursue mastery

We win together


These are the principles we embody as an organization. They influence how we work as individuals and teams, and what we look for in candidates who join us. We’re glad you’re here! If this sounds like an environment where you’d thrive, read on. 


We are seeking a highly skilled Lead Data Engineer with extensive experience in building cloud data lakes and architecting large-scale data platforms. You will be instrumental in designing and implementing data architectures that support diverse use cases, from AI/ML to business intelligence (BI).


The ideal candidate will have deep expertise in tabular formats like Apache Iceberg, Apache Parquet, and other open standards. As a Lead Data Engineer, you will work closely with data scientists, AI teams, and business stakeholders to ensure that our data infrastructure is robust, scalable, and optimized for a variety of computational workloads. This role requires an innovative mindset and the ability to lead data engineering projects, making key architectural decisions that shape our data ecosystem.

What you'll do:

  • Design and Architect Cloud Data Lakes: Lead the design and development of scalable data lake architectures on cloud platforms (e.g., AWS, Azure, GCP), optimized for both structured and unstructured data.
  • Tabular Data Formats: Implement and manage tabular formats like Apache Iceberg, Parquet, and other open standards to efficiently store and process large datasets.
  • Data Platform Development: Architect and build large-scale, highly available data platforms that support real-time analytics, reporting, and AI workloads.
  • Compute Engines: Leverage various compute engines (e.g., Apache Spark, Dremio, Presto, Trino) to support complex business intelligence and AI use cases, optimizing performance and cost-efficiency.
  • Collaboration with AI Teams: Work closely with AI and machine learning teams to design data pipelines that enable AI model training, deployment, and real-time inference.
  • Data Governance: Establish best practices for data governance, ensuring data quality, security, and compliance with industry regulations.
  • Lead and Mentor: Provide technical leadership to data engineering teams and mentor junior engineers, fostering a culture of continuous learning and innovation.

Requirements:

  • Big Data Technologies: Strong knowledge of big data processing frameworks and data streaming technologies.
  • AI/ML Data Integration: Experience collaborating with AI/ML teams, building data pipelines that feed AI models, and ensuring data readiness for machine learning workflows.
  • Experience in Cloud Data Lakes: Proven experience in architecting and building data lakes on cloud platforms (AWS, Azure, GCP).
  • Open Standards Expertise: In-depth knowledge of Apache Iceberg, Apache Parquet, and other open standards for efficient data storage and query optimization.
  • Compute Engines: Expertise in using compute engines such as Apache Spark, Dremio, Presto, or similar, with hands-on experience in optimizing them for business intelligence and AI workloads.
  • Leadership: Proven track record of leading large-scale data engineering projects and mentoring teams.
  • Programming Languages: Proficiency in languages such as Python, Java, or Scala, and SQL for querying and managing large datasets.
  • AI/ML Workflows: Previous experience working directly with AI or machine learning teams is preferred.
  • Distributed Systems: A deep understanding of distributed systems and the challenges of scaling data infrastructure in large, dynamic environments is preferred.
  • Data Warehouse Experience: Familiarity with modern data warehousing solutions such as Snowflake or Redshift is preferred.

Compensation:

  • $190,000-$230,000 salary DOE + equity
  • Company-sponsored benefits, including:
      • Medical, dental, and vision insurance (free to employees and dependents)
      • 401K, HSA, FSA, short/long-term disability coverage, life insurance
      • 6 weeks of combined accrued vacation + sick time
      • Volunteer days + standard holidays
      • 24 weeks paid parental leave for everyone + 1 month transition time back + childcare stipend for first year
      • Generous monthly and annual stipend for internet + home office

Top Skills

Java
Python
Scala
SQL
The Company
Burlingame, California
155 Employees
On-site Workplace
Year Founded: 2015

What We Do

Founded in 2015, we help companies big and small alike manage and audit access to their databases, servers, clusters, and web applications.

Why Work With Us

We started as, and will remain, a remote work company. We offer great benefits, 401(k), and, well, the works. We respect your right to non-work time and recognize that sometimes, everyone just has to go for a walk. We’re well-funded, have amazing customers, and are here to stay. We love what we do, get it done, and treat each other with respect.
