Job Description:
Senior Data Scientist — ML & Semantic AI
Technologies: Azure · NLP · RAG · Semantic Matching · Python
Role Summary
We are looking for a Data Scientist with expertise in Python, Azure Cloud, and NLP to build and enhance machine learning models at scale. The role includes embedding optimisation, semantic matching, LDA and RAG architectures, dense and sparse retrieval pipelines, and migration of cloud-native data pipelines to Azure Databricks.
Core Requirements
- Design and execute end-to-end machine learning pipelines including data extraction, preprocessing, feature engineering, model development, tuning, and deployment.
- Develop machine learning pipelines using Azure Synapse, Databricks, and Snowflake.
- Build and deploy classification, regression, and clustering models.
- Develop and deploy proof-of-concept solutions for client use cases.
- Implement semantic matching and similarity search using cosine similarity, dot-product scoring, and bi-encoder/cross-encoder architectures (e.g., SBERT, sentence-transformers).
- Build embedding models by fine-tuning pre-trained models and optimising embedding storage in vector databases such as Chroma DB, FAISS, and Azure AI Search.
- Train and optimise models for new data providers with dynamic input handling.
- Improve LDA model performance for large-scale topic modelling.
- Implement hybrid semantic search by combining dense and sparse retrieval methods.
- Optimise RAG architectures and retrieval QA systems for chatbot and recommendation performance.
- Enable semantic query understanding using intent classification and query expansion techniques.
- Develop forecasting models for marketing, demand prediction, and trend analysis.
- Apply NLP-based forecasting techniques using sentiment and external data.
- Use semantic similarity for audience intelligence, including zero-shot and few-shot classification techniques.
- Migrate data pipelines from Azure Synapse to Azure Databricks and retrain models accordingly.
- Optimise embedding storage and retrieval within Azure AI Search.
- Perform vector index tuning including HNSW optimisation and ANN benchmarking for production systems.
Python, Azure Databricks, Azure ML, Azure Synapse, Azure Blob Storage, Scikit-learn, NumPy, Pandas, Hugging Face, sentence-transformers, FAISS, Chroma DB, Azure AI Search, LangChain, TensorFlow, PyTorch, Statsmodels, Azure OpenAI.
Location:
DGS India - Mumbai - Thane Ashar IT Park
Brand:
Merkle
Time Type:
Full time
Contract Type:
Permanent
Skills Required
- Proficiency in Python
- Experience with Azure Databricks
- Experience with Azure ML
- Experience with Azure Synapse
- Experience with Azure Blob Storage
- Experience with Snowflake
- Experience designing and executing end-to-end ML pipelines (data extraction, preprocessing, feature engineering, model development, tuning, deployment)
- Experience building classification, regression, and clustering models
- Expertise in NLP, semantic matching, similarity search (cosine, dot-product), bi-encoder/cross-encoder architectures
- Experience with embedding model fine-tuning and storage in vector DBs (Chroma DB, FAISS, Azure AI Search)
- Experience implementing RAG architectures and retrieval-augmented QA systems
- Familiarity with sentence-transformers / SBERT and Hugging Face ecosystem
- Experience with LangChain
- Experience with TensorFlow and PyTorch
- Experience with Scikit-learn, NumPy, Pandas, Statsmodels
- Experience migrating data pipelines to Azure Databricks and retraining models
- Experience with vector index tuning (HNSW) and ANN benchmarking for production
- Experience improving LDA topic modelling at scale
- Experience building forecasting models using NLP signals and external data
- Familiarity with Azure OpenAI
What We Do
Dentsu Creative is a global creative agency network designed to unlock exponential growth for clients. We use Transformative Creativity as a differentiating, driving force to bring our capabilities together to positively impact people, business and society. Established in 2022, Dentsu Creative is integrated with dentsu’s Media and CXM businesses in over 145 countries and regions, to offer Integrated Growth Solutions.