HOW YOU'LL SPEND YOUR TIME
- Design and implement highly scalable, fault-tolerant data pipelines for real-time and batch processing.
- Develop and optimize end-to-end Databricks Spark pipelines for ingesting, processing, and transforming large volumes of structured and unstructured data.
- Build and manage ETL (Extract, Transform, Load) processes to integrate data from diverse sources into our data ecosystem.
- Implement data validation, governance, and quality assurance mechanisms to ensure data accuracy, completeness, and reliability.
- Collaborate with data scientists, ML engineers, and analysts to integrate AI/ML models into production environments, ensuring efficient data pipelines for training, deployment, and monitoring.
- Work with real-time data streaming solutions such as Kafka, Kinesis, or Flink to process and analyze event-driven data.
- Optimize the performance, scalability, and efficiency of data workflows and storage solutions.
- Document technical designs, workflows, and best practices to facilitate knowledge sharing and maintain system documentation.
WHO YOU ARE
- 4+ years of experience as a professional software/data engineer, with a strong background in building large-scale distributed data processing systems.
- Experience with AI, machine learning, or data science concepts, including working on ML feature engineering, model training pipelines, or AI-driven data analytics.
- Hands-on experience with Apache Spark (Scala or Python) and Databricks.
- Experience with real-time data streaming technologies such as Kafka, Flink, Kinesis, or Dataflow.
- Proficiency in Java, Scala, or Python for building scalable data engineering solutions.
- Deep understanding of cloud-based architectures (AWS, GCP, or Azure) and experience with S3, Lambda, EMR, Glue, or Redshift.
- Experience writing well-designed, testable, and scalable AI/ML data pipelines that can be reused and maintained, backed by effective unit and integration testing.
- Strong understanding of data warehousing principles and best practices for optimizing large-scale ETL workflows.
- Experience with ML frameworks such as TensorFlow, PyTorch, or Scikit-learn.
- Experience optimizing ML feature engineering and model training pipelines for scalability and efficiency.
- Knowledge of SQL and NoSQL databases for structured and unstructured data storage.
- Passion for collaborative development, continuous learning, and mentoring junior engineers.
WHAT CAN HELP YOU STAND OUT
- Exposure to MLOps or Feature Stores for managing machine learning model data.
- Experience with data governance, compliance, and security best practices.
- Experience working in a fast-paced startup environment.
WE'VE GOT YOU COVERED
- Every Teikametrics employee is eligible for company equity
- Remote Work – flexibility to work from home or from our offices + remote working allowance
- Broadband reimbursement
- Group Medical Insurance – Coverage of INR 7,50,000 per annum for a family
- Crèche benefit
- Training and development allowance
What We Do
Teikametrics helps sellers and brand owners grow their businesses on Amazon and Walmart.com through the combination of data, AI-powered technology, and marketplace expertise. Teikametrics Flywheel, the first Marketplace Optimization Platform, connects and optimizes critical ecommerce business operations including advertising, inventory, and market intelligence - all in one place.
Why Work With Us
At Teikametrics, we believe that our employees are our greatest asset. In this fast-paced startup environment, our team consistently puts our values into action, even while remote. We are committed to the success of not only our clients but also our people. Join our growing team if you're ready for new challenges and some good fun along the way.