Senior Data Engineer

2 Locations
Remote
Senior level
Information Technology • Software • Cybersecurity
The Role

As a Senior Data Engineer, you will be the architect of our security data ecosystem. Your primary mission is to design and build high-performance data lake architectures and real-time streaming pipelines that serve as the foundation for COGNNA's Agentic AI initiatives. You will ensure that our AI models have access to fresh, high-quality security telemetry through sophisticated ingestion patterns.

Key Responsibilities

1. Data Lake & Storage Architecture

  • Architectural Design: Design and implement multi-tier Data Lakehouse architectures to support both structured security logs and unstructured AI training data.
  • Storage Optimization: Define lifecycle management, partitioning, and clustering strategies to ensure high-performance querying while optimizing for cloud storage costs.
  • Schema Evolution: Manage complex schema evolution for security telemetry, ensuring compatibility with downstream AI/ML feature engineering.
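The partitioning and clustering strategy described above is usually realized as a predictable directory layout that query engines can prune. A minimal stdlib-Python sketch of Hive-style partitioning for security events (the bucket path and event shape are illustrative assumptions, not from this posting):

```python
from datetime import datetime, timezone

def partition_path(event: dict, base: str = "s3://security-lake/events") -> str:
    """Derive a Hive-style partition path (dt=/source=) for a security event.

    Partitioning on ingest date and log source lets query engines skip
    irrelevant files, the usual lever for both query latency and storage cost.
    """
    ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
    return f"{base}/dt={ts:%Y-%m-%d}/source={event['source']}/"

# A hypothetical firewall event from 2024-06-01 (UTC)
event = {"ts": 1717200000, "source": "firewall"}
print(partition_path(event))
# s3://security-lake/events/dt=2024-06-01/source=firewall/
```

In practice the same layout is expressed declaratively (e.g., BigQuery partitioned and clustered tables) rather than computed by hand; the sketch only shows the pruning idea.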

2. Real-Time & Streaming Processing

  • Streaming Ingestion: Build and manage low-latency, high-throughput ingestion pipelines capable of processing millions of security events per second in real time.
  • Unified Processing: Design unified batch and stream processing architectures to ensure consistency across historical analysis and real-time threat detection.
  • Event-Driven Workflows: Implement event-driven patterns to trigger AI agent reasoning based on incoming live data streams.
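The event-driven pattern in the last bullet can be reduced to a small sketch: a stateful handler watches the live stream and fires a downstream action (here, a stand-in for invoking an AI agent) when a rule crosses its threshold. The rule and event shape are hypothetical, stdlib Python only:

```python
from typing import Callable, Iterable

def make_threshold_trigger(threshold: int, action: Callable[[str], None]):
    """Return a handler that fires `action` once a source IP accumulates
    `threshold` failed-login events — an illustrative detection rule."""
    counts: dict[str, int] = {}

    def on_event(event: dict) -> None:
        if event.get("type") != "auth_failure":
            return
        ip = event["src_ip"]
        counts[ip] = counts.get(ip, 0) + 1
        if counts[ip] == threshold:  # fire exactly once at the threshold
            action(ip)               # e.g., hand off to agent reasoning

    return on_event

def run(stream: Iterable[dict], on_event: Callable[[dict], None]) -> None:
    for event in stream:
        on_event(event)

alerts: list[str] = []
handler = make_threshold_trigger(3, alerts.append)
run([{"type": "auth_failure", "src_ip": "10.0.0.5"}] * 3, handler)
print(alerts)
# ['10.0.0.5']
```

A production version of this loop would live in a stream processor (Beam/Flink) with keyed, checkpointed state rather than an in-memory dict, but the trigger shape is the same.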

3. AI/ML Enablement & Feature Engineering

  • Vector Data Foundations: Architect the data infrastructure required to support semantic search applications and Retrieval-Augmented Generation (RAG) architectures for our generative AI models.
  • Feature Management: Design and maintain a centralized repository for ML features, ensuring consistent data is used for both model training and real-time inference.
  • AI Pipeline Orchestration: Build automated workflows to handle data preparation, model evaluation, and deployment within our cloud AI ecosystem.
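The feature-management goal above — one write path feeding both training and real-time inference — can be shown with a deliberately minimal in-memory sketch (class and feature names are hypothetical; a real feature store adds versioning, point-in-time correctness, and persistence):

```python
class FeatureStore:
    """Toy feature store: one write path, two read paths, so training and
    online inference are guaranteed to see identical feature values."""

    def __init__(self) -> None:
        self._features: dict[tuple[str, str], float] = {}

    def put(self, entity_id: str, name: str, value: float) -> None:
        self._features[(entity_id, name)] = value

    def get_online(self, entity_id: str, name: str) -> float:
        # Serving path: low-latency lookup at inference time.
        return self._features[(entity_id, name)]

    def get_training_rows(self, name: str) -> list[tuple[str, float]]:
        # Batch path: model training reads the same stored values.
        return [(eid, v) for (eid, n), v in self._features.items() if n == name]

store = FeatureStore()
store.put("host-01", "failed_logins_1h", 7.0)
print(store.get_online("host-01", "failed_logins_1h"))
# 7.0
```

The design point is training/serving skew: because both paths read the same record, a model never trains on one definition of `failed_logins_1h` and serves on another.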

4. DataOps & Systems Design

  • Infrastructure as Code: Utilize declarative tools (e.g., Terraform) to manage the entire lifecycle of our cloud data resources and AI endpoints.
  • Quality & Observability: Implement automated data quality frameworks and real-time monitoring to detect "data drift" or pipeline failures before they impact AI model performance.
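The "data drift" monitoring in the last bullet can be illustrated with the simplest possible metric — a standardized mean shift between a baseline window and the current window. This is a stand-in for production drift measures such as PSI or a KS test, using only the stdlib; the threshold of 3 is an assumption, not a standard:

```python
import statistics

def drift_score(baseline: list[float], current: list[float]) -> float:
    """How many baseline standard deviations the current mean has moved.

    Large scores suggest the feature's distribution has shifted and the
    downstream AI model may be seeing data it was not trained on.
    """
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return abs(statistics.mean(current) - mu) / sigma

baseline = [10.0, 11.0, 9.0, 10.5, 9.5]
print(drift_score(baseline, [10.2, 9.8, 10.1]) < 3.0)   # stable window
print(drift_score(baseline, [25.0, 26.0, 24.0]) > 3.0)  # drifted window
```

Wiring a check like this into pipeline observability lets drift alert before model quality degrades, which is the point of the bullet above.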

Requirements
  • Experience & Education: 5+ years in Data Engineering or Backend Engineering, focused on large-scale distributed systems. B.S. or M.S. in Computer Science or a related technical field.
  • Cloud Architecture: Deep architectural mastery of the Google Cloud Platform ecosystem, specifically regarding managed analytical warehouses, serverless compute, and identity/access management. Proven track record of deploying enterprise-scale Data Lakehouses from scratch.
  • Real-Time Mastery: Expertise in building production-grade distributed messaging and stream processing engines (e.g., managed Apache Beam/Flink environments) capable of handling high-velocity telemetry.
  • AI Enablement: Strong understanding of how data architecture impacts AI performance. Experience building embedding pipelines, feature stores, and automated workflows for model training and evaluation.
  • Software Fundamentals: Expert-level Python and advanced SQL. Proficiency in high-performance languages like Go or Scala is highly desirable.
  • Operational Excellence: Advanced knowledge of CI/CD, containerization on Kubernetes, and managing cloud infrastructure through code to ensure reproducible environments.
Preferred Qualifications
  • Experience with dbt for modern analytics engineering.
  • Understanding of cybersecurity data standards (OCSF/ECS).
  • Previous experience in an AI-first startup or a high-growth security tech company.

Benefits

💰 Competitive Package – Salary + equity options + performance incentives
🧘 Flexible & Remote – Work from anywhere with an outcomes-first culture
🤝 Team of Experts – Work with designers, engineers, and security pros solving real-world problems
🚀 Growth-Focused – Your ideas ship, your voice counts, your growth matters
🌍 Global Impact – Build products that protect critical systems and data

Top Skills

Apache Beam
Apache Flink
dbt
Go
Google Cloud Platform
Kubernetes
Python
Scala
SQL
Terraform

The Company
50 Employees
Year Founded: 2022

What We Do

Detect the Undetectable. Defeat the Unpredictable.
