Senior Data Engineer

Posted 9 Days Ago
Kakkanad, Ernakulam, Kerala
In-Office
Senior level
Artificial Intelligence • Information Technology • Software • Database • Analytics
The Role
Design, develop, and deploy SDS platform modules for large-scale security analytics. Build and optimize data ingestion, parsing, scheduling, and processing pipelines in the cloud and on premises. Ensure code quality; write unit, system, and integration tests; and integrate big data tools and frameworks such as Databricks and Spark. Research and adopt new big data technologies; collaborate using Agile practices.
Summary Generated by Built In

Company Profile:

Prevalent AI (PAI) is a Security Data Science company founded in the UK by globally recognized experts in solving the world’s toughest security problems. We apply leading Security Data Science knowledge and expertise to help companies understand, deploy, and support the most advanced security solutions, developing security architectures based on a deep understanding of Data Science, Security Tradecraft, and Big Data Technologies.

 

PAI’s Security Data Science (SDS) platform is a big data security analytics platform that ingests a wide range of security telemetry data and applies advanced analytical approaches to identify control weaknesses and security risks within enterprises.

 

The PAI team consists of Cyber Security Domain Specialists, Information Security Analysts, Data Scientists, Data Engineers, and Data Analysts focused on developing advanced security analytics solutions (Solution Development) and delivering security insights to our clients.

Prevalent AI India Pvt Ltd., a subsidiary of Prevalent AI, has offices in Infopark, Cochin, Kerala. For more information, please visit https://www.prevalent.ai

Role Purpose and Key Accountabilities:

The primary role of a Senior Data Engineer is to support the design and development of the data engineering components of Prevalent’s Security Data Science (SDS) Platform across data collection, ingestion, processing, and analytics. SDS is built on open-source big data technologies designed to ingest, process, and analyse petabyte-scale data in a distributed architecture.

 

The ideal candidate is a self-motivated individual with strong technology skills, a commitment to quality, and a positive work ethic, who can:

 

As an Individual Contributor

  • Participate in the design, development, and deployment of SDS platform modules in the cloud and on premises.
  • Design and develop data ingestion, parsing, scheduling, processing, and other data management components, with a focus on big data ingestion, processing, and management.
  • Define, develop, and integrate unit, system, and integration tests.
  • Act as a guardian of code quality in terms of conciseness, maintainability, performance, security, dependencies, and open-source licensing.
  • Research, evaluate, and integrate big data tools and frameworks.
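The parsing and unit-testing duties above can be sketched in miniature. The following Python snippet is purely illustrative: the `parse_telemetry_line` function and the log format are hypothetical, not part of the SDS platform.

```python
import re

# Hypothetical example: a tiny parser for one line of security telemetry,
# together with the kind of unit check a data engineer would write for it.
# The '<timestamp> <host> <event...>' format is an assumption for illustration.
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<host>\S+) (?P<event>.+)$")

def parse_telemetry_line(line: str) -> dict:
    """Parse '<timestamp> <host> <event...>' into a dict, or raise ValueError."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unparseable line: {line!r}")
    return match.groupdict()

# Unit check: a well-formed input maps to the expected fields.
record = parse_telemetry_line("2024-05-01T12:00:00Z fw01 connection denied")
assert record == {
    "ts": "2024-05-01T12:00:00Z",
    "host": "fw01",
    "event": "connection denied",
}
```

In practice, logic like this would run inside Spark jobs over distributed telemetry, with the same parsing function exercised by unit tests before being wired into ingestion pipelines.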

 

Personal Development

  • Develop a keen understanding of industry trends and best practices related to Data Engineering and open-source Big Data technologies.
  • Create and execute a personal education plan in the technology stack and solution architecture with your supervisor.

Skills and Experience:

  • Strong hands-on experience with Databricks and open-source big data technologies such as Spark and Scala, along with solid SQL expertise
  • Proven experience designing, building, and optimizing data pipelines and data architectures using orchestration tools like Airflow and stream-processing systems such as Spark Streaming
  • Experience working with cloud platforms (AWS or Azure), including services like EC2, EMR, RDS, S3, Lambda, and scalable big data storage systems
  • Hands-on exposure to message queuing, stream processing, and highly scalable distributed data platforms
  • Proficiency in Scala and/or Python, with a solid understanding of the Spark framework and Agile methodologies
  • Experience with API service creation
  • Exposure to Spark-based LLM integrations
  • Great verbal and written communication skills.
  • Strong project management and organizational skills.

Knowledge:

  • Good knowledge of functional programming languages: Scala and/or Python (PySpark)
  • Excellent understanding of Spark Framework.
  • Good understanding of Agile methodology.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores

Education:

  • Master’s or Bachelor’s degree in Computer Science or Engineering

Values:

Inquisitive: Inquisitive by nature, hungry for knowledge, and excited about new challenges.

Authentic: Genuine, with a desire to create real and lasting relationships with clients, partners, and colleagues.

Focused: Tirelessly pursuing results, driven by a clear sense of purpose to make clients and partners secure and successful.

Top Skills

Databricks, Spark, Scala, SQL, Airflow, Spark Streaming, AWS, Azure, EC2, EMR, RDS, S3, Lambda, Python, PySpark, message queuing, stream processing

The Company
HQ: London
157 Employees
Year Founded: 2017

What We Do

Prevalent AI was founded to assemble the world’s best AI and Data Science talent, a team capable of building the security analytics of the future.

In a security technology landscape filled with rigid, siloed solutions and disparate data, organizations are unable to tackle threats and vulnerabilities effectively. By combining our Security Data Fabric with AI-powered Exposure Management, we provide our clients with complete clarity of their cyber risk.

Our Security Data Fabric automates the integration of complex and disparate data into a single unified knowledge graph, turning data chaos into data clarity with AI-powered entity resolution.

Our Exposure Management platform identifies every attack surface, contextualizes and prioritizes risk findings, and rapidly remediates exposures — so you’ll always stay one step ahead of attackers.

