Data Engineer Intern / Data Science Intern [PFE]

Posted 4 Days Ago
Les Berges du Lac, Tunis
In-Office
Internship
Information Technology • Software
The Role
The internship involves designing and implementing ETL pipelines, managing cloud storage, and developing automated data solutions using various tools and technologies.

About us 

At Cognira, we strongly believe that people are the biggest asset of our company. Our hand-picked team consists of passionate, collaborative, and forward-thinking individuals from all over the globe. We are passionate about making science easy and accessible to retailers, helping them get more value from people, data, and systems. We bring together expertise in retail, science, and scalable technologies to automate and enhance the quality of decision-making through software and consulting services.

We are proud to have a growing team of domain experts and data scientists, as well as a culture that fosters strong and long-lasting relationships with our clients. 

Our Locations

Atlanta, GA, U.S.

Paris, France

Istanbul, Turkey

Tunis, Tunisia 

Porto, Portugal

Our values:

Be nice and trustworthy

Put our customers first

Succeed together

Never stop improving

About this internship 
 We're looking for highly talented & motivated interns to join our Data team and nail one of the following projects: 

1- Data Engineering Department: 


1️⃣ Automated Data Pipeline Deployment & Optimization Framework using Databricks Bundles
Objective: Design and implement a modular ETL pipeline on Databricks and enable parameterized, YAML-driven deployments using Databricks Bundles. Implement Spark performance optimizations and CI/CD to promote pipelines across environments.
Key tasks: Ingest multi-source retail data → transform → write to Delta Lake; author Databricks YAML job definitions; benchmark Spark optimizations; integrate Bitbucket Pipelines or Jenkins for deployment and promotion.
Technologies (required / recommended): Databricks, Spark (PySpark), Delta Lake, Databricks Bundles (YAML), Python, Bitbucket Pipelines or Jenkins, Git, YAML.

2️⃣ Automated ETL Pipeline Deployment & Management Framework using Databricks REST API
Objective: Build a programmatic deployment and management layer for Databricks using the Databricks REST API to create/configure clusters, jobs, and notebooks dynamically and securely. Integrate CI/CD to automate deployments and run integration tests.
Key tasks: Implement REST-based job/cluster orchestration; parameterize environments via JSON configs; secure credential management; trigger and monitor jobs programmatically; integrate CI/CD.
Technologies (required / recommended): Databricks REST API, Python (requests/HTTP), Spark (PySpark), Delta Lake, JSON config management, Jenkins or Bitbucket Pipelines, Docker (for tooling), secure secrets handling (e.g., Azure Key Vault or Databricks secrets).
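As a rough illustration of what "trigger jobs programmatically" could look like, the sketch below builds (without sending) an authenticated request to the Databricks Jobs API `run-now` endpoint. The workspace URL, job ID, token, and the `build_run_now_request` helper are hypothetical placeholders, not part of the project specification; in practice the token would come from a secret store such as Azure Key Vault or Databricks secrets.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id, notebook_params=None):
    """Build (but do not send) a POST request that triggers an existing job."""
    payload = {"job_id": job_id}
    if notebook_params:
        payload["notebook_params"] = notebook_params
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical per-environment config, as it might be loaded from a JSON file.
env_config = {"host": "https://example.cloud.databricks.com", "job_id": 123}

request = build_run_now_request(
    host=env_config["host"],
    token="dummy-token",  # placeholder; read from a secret store in practice
    job_id=env_config["job_id"],
    notebook_params={"env": "dev"},
)
# Sending would be: urllib.request.urlopen(request), omitted here on purpose.
```

Keeping request construction separate from sending makes the orchestration layer easy to unit-test in CI without a live workspace.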

3️⃣ Data Collection API for Clients and File-Level Data Validation
Objective: Architect and implement a secure, scalable file-ingestion API that provides validation, auto-renaming, manifest generation, and reliable transfer to cloud storage (with full traceability).
Key tasks: Build file upload API endpoints; validate file integrity and metadata; copy validated files to Azure Blob Storage / ADLS and generate manifest entries; implement logging/auditing and Dockerize the service.
Technologies (required / recommended): Python (FastAPI or Flask), Docker, Azure Blob Storage / ADLS, Azure SDK for Python, JSON manifests, unit/integration testing frameworks, basic security/auth (API keys / OAuth / managed identity).

2- Data Science Department 

1️⃣ Tiny Time Mixers for Demand Forecasting (TSMixer / Time-Series LLMs)
Build a robust data pipeline to convert structured sales + promotion data into multivariate inputs for modern time-series backbones. Implement a forecasting model using TSMixer-style architectures (IBM/Google variants), and evaluate zero-shot, few-shot and full-shot performance versus traditional regression baselines.

2️⃣ Advanced Modeling of Intermittent and Sparse Demand
Develop and benchmark models tailored for 'lumpy' products with long zero-demand intervals. Compare Croston variants and zero-inflated statistical approaches, then prototype sequential approaches (HMMs, Switching State-Space, and Deep State-Space Models using LSTMs/GRUs). Implement a two-part forecasting approach (event timing + non-zero magnitude) and propose metrics that reflect inventory and event-level performance.
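For orientation, a minimal sketch of the classic Croston baseline is shown below. It smooths the non-zero demand size and the interval between demands separately, which mirrors the two-part idea (event timing + non-zero magnitude). The initialization and the `alpha` value are one common choice among several, and the `croston` helper name is ours, not a library API.

```python
def croston(demand, alpha=0.1):
    """Return (size, interval, rate) estimates from Croston's method.

    size     : smoothed non-zero demand magnitude
    interval : smoothed number of periods between non-zero demands
    rate     : size / interval, the per-period demand forecast
    """
    size = None
    interval = None
    periods_since_demand = 1
    for value in demand:
        if value > 0:
            if size is None:
                # Initialize both components from the first non-zero demand.
                size = float(value)
                interval = float(periods_since_demand)
            else:
                # Update only in periods where demand actually occurs.
                size += alpha * (value - size)
                interval += alpha * (periods_since_demand - interval)
            periods_since_demand = 1
        else:
            periods_since_demand += 1
    if size is None:
        return 0.0, None, 0.0  # no demand observed at all
    return size, interval, size / interval


# Usage example on a short lumpy series.
size, interval, rate = croston([0, 0, 4, 0, 0, 4], alpha=0.5)
```

Variants such as SBA apply a bias correction to `rate`; comparing them against this baseline is the kind of benchmark the project description refers to.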

3️⃣ LLM-Driven Feature Engineering for Knowledge-Augmented Retail Data
Design a Knowledge-Augmented Feature Pipeline (KAFP) to extract time-stamped, quantifiable exogenous features from unstructured sources (news, social media, competitor announcements, weather). Implement a Retrieval-Augmented Generation (RAG) pipeline and LLM prompt chains to generate signals like sentiment scores, trend indices, and competitive activity indices that can be joined into forecasting datasets.


You will be part of a high-growth software company. Our program is designed so interns can grow their skill sets, do meaningful work, and have a lot of fun along the way!

  • Over the course of the internship, you will be exposed to a wide range of Cognira’s tools, techniques, and technologies and have the opportunity to gain credible, hands-on experience.
  • This internship will be entirely in person so that you can experience the company's culture in depth and be more involved throughout your tenure.
  • Duration: 4-6 months.

This is what we're looking for:

  • Excellent academics in Computer Science, Engineering, or a related field.
  • Problem-solving is your jam, and you're all about critical thinking.
  • You're not afraid to roll up your sleeves and get stuff done, even when working independently with minimal supervision.
  • You can juggle multiple projects like a pro.
  • Challenges don't scare you; in fact, you love diving into them.
  • You can communicate like a champ, whether it's writing reports or presenting in a room full of people.
  • You're curious, and you love picking up new skills & technologies.
  • You're a team player, always up for sharing your ideas and best practices.

It's not just an internship; we've got some great added value for you too. Here's what you'll enjoy:

  • Great company culture.
  • "Learn and Share" sessions.
  • You'll get support from your mentors.
  • Social events and after-work gatherings.
  • A flexible and fun work environment.
  • Casual dress code.
  • You'll work with a cool team! We respect your ideas, and we're all about trying new things.
  • Work/life balance.

[ Important: Please send us your resume in English only ]

Top Skills

Azure Blob Storage
Azure SDK for Python
Bitbucket Pipelines
Databricks
Databricks Bundles
Databricks REST API
Delta Lake
Docker
FastAPI
Flask
Git
Jenkins
JSON
PySpark
Python
Spark
YAML

The Company
Atlanta, GA
95 Employees

What We Do

Transform your Retail Systems with AI Software Solutions: Intelligent Promotion, Accurate Forecasting, and Optimized Allocation & Assortment
