Senior ML Platform Engineer

Toronto, ON
In-Office
Senior level
Information Technology • Industrial • Manufacturing
The Role
This role involves leading MLOps initiatives to automate Machine Learning workflows on GCP, collaborating with various stakeholders and ensuring efficient project delivery.

Job Description:

Here at Rakuten Kobo Inc., we offer a casual, start-up working environment and a group of friendly and talented individuals. Our employees rank us highly in terms of commitment to work/life balance. We realize that for our people to be innovative, creative, and passionate, they need to feel valued and supported. We believe in rewarding all our employees with competitive salaries, performance-based annual bonuses, and stock options.
 

If you’re looking for a company that inspires passion and personal and professional growth, join Kobo and come help us make reading lives better.

The Role: Senior ML Platform Engineer (MLOps)

Rakuten Kobo Inc. is seeking a visionary and highly skilled Senior ML Platform Engineer to architect, build, and lead the evolution of our internal Machine Learning Platform and MLOps capabilities. In this pivotal role, you will define the strategic roadmap and hands-on implementation for a state-of-the-art, fully automated ML framework on the Google Cloud Platform (GCP).

You will be instrumental in designing and developing the core infrastructure, tools, and services that empower our Data Scientists and ML Engineers to efficiently develop, deploy, monitor, and manage their Machine Learning models throughout their lifecycle. Collaborating closely with Data Scientists, Data Engineers, Platform Engineers, and business stakeholders, you will transform manual ML production processes into a seamless, scalable, and reproducible ML Platform.

This groundbreaking position is dedicated to streamlining the entire ML project lifecycle by providing a robust, self-service platform, ensuring the continuous delivery of significant business value through innovative Machine Learning solutions. Success in this role demands not only profound ML engineering and platform-building expertise but also a strategic, forward-thinking mindset for seamlessly integrating ML/AI into the core of our engineering practices at scale.

Experience and Background:

  • 8+ years of professional experience in ML Engineering or related fields, with a significant portion dedicated to ML Platform development.
  • Proven experience leading the design, development, and implementation of a custom ML Platform or significant MLOps infrastructure for an organization. This is the most crucial must-have.
  • Deep expertise in MLOps tools and their integration into a platform, including:
    • Orchestration: Kubeflow, Airflow, Argo Workflows, Step Functions, Vertex AI Pipelines.
    • Experiment Tracking & Model Registry: MLflow, DVC, Vertex AI ML Metadata, SageMaker Experiments/Model Registry.
    • Model Monitoring & Observability: Prometheus, Grafana, Arize, SageMaker Model Monitor, Vertex AI Model Monitoring.
    • Data/Model Versioning: DVC, Git-LFS, internal systems.
    • Feature Stores: Feast, Hopsworks, or custom-built.
    • CI/CD for ML: Jenkins, GitHub Actions, GitLab CI, Buildkite, Argo CD (GitOps).
    • Containerization & Orchestration: Docker, Kubernetes, Helm.
  • Strong proficiency in Python.
  • Extensive Cloud Experience, with a strong preference for GCP. This includes hands-on experience with GCP MLOps services (Vertex AI, Dataflow, BigQuery ML, Cloud Build, GKE, Cloud Composer).
  • Experience moving companies from manual to automated processes at scale, particularly in the context of ML development and deployment.
  • Demonstrated Seniority: Ability to lead projects, make architectural decisions, mentor junior engineers, and influence technical strategy. This includes communicating complex technical concepts to non-technical stakeholders.
  • Solid understanding of ML fundamentals (predictive modeling, deep learning); GenAI/LLM knowledge is a plus but secondary to platform expertise.

The Skillset:

Strong hands-on experience with GCP tools such as:

  • Vertex AI
  • BigQuery
  • Cloud Storage
  • Cloud Composer / Airflow
  • Cloud Build and Cloud Deploy
  • Cloud Functions

 

MLOps framework and Automation:

  • Strong understanding of data ingestion pipelines and experiment tracking tools.
  • Ability to enforce reproducibility and lineage tracking.
  • Familiarity with Kubeflow and/or TFX.
  • Proven ability to design and implement CI/CD pipelines for ML (automated training, testing, and deployment; integration with GitHub or Cloud Build).
  • Experience with model versioning and registry (Vertex AI Model Registry).
  • Knowledge of Feature Store design.
  • Ability to set up automated monitoring for data drift, model drift, and model performance.
  • Experience setting up observability stacks (logging, metrics, alerts, model health dashboards).

 

Software Engineering and DevOps:

  • Proficiency in Python (mandatory); familiarity with R, Scala, or Java as needed.
  • Experience with containerization (Docker) and orchestration (Kubernetes, GKE).
  • Strong background in infrastructure as code (Terraform, Deployment Manager).
  • Ability to implement unit tests, integration tests, and ML-specific validation.

 

Compliance and Best Practices:

  • Knowledge of responsible AI practices (bias, explainability).
  • Familiarity with data governance, security and compliance standards.
  • Strong ability to document and enforce coding standards, review processes, and reproducibility guidelines.

 

Nice to have:

  • Familiarity with the eBook, audiobook, or publishing industry.
  • Contributions to open-source projects related to MLOps or autonomous systems.

The Perks: 

  • Flexible hours and working environment  
  • 4 extended summer long weekends 
  • Full benefits starting from your first day  
  • Paid Volunteer days, unlimited sick days, and 3% RRSP matching  
  • Monthly commuting allowance for hybrid employees  
  • Flexible health spending account  
  • Training budget + Udemy account  
  • Free Kobo device + free weekly e-book or audiobook  
  • Weekly Kobo Tech University sessions  
  • Maternity/paternity leave top up  
  • 90 Day Work from Anywhere program  
  • Daily lunch credit when in-office and in-office snacks  
  • Dog friendly office 
     

About Rakuten Kobo Inc. 

Owned by Tokyo-based Rakuten and headquartered in Toronto, Rakuten Kobo Inc. is one of the most advanced global ecommerce companies, with the world’s most innovative eReading services offering more than 6 million eBooks and audiobooks to 30 million+ customers in 190 countries. Kobo delivers the best digital reading experience through creative innovation, award-winning eReaders, and top-ranking mobile apps. Kobo is a part of the Rakuten group of companies.

Rakuten Kobo Inc. is an equal opportunity employer. Accessibility accommodations for candidates with disabilities participating in the selection process are available on request. Any information received related to accommodation needs of applicants will be addressed confidentially. 
 

Rakuten Kobo would like to thank all applicants for their interest in this role; however, only qualified candidates will be shortlisted.

 #RKIND

Five Principles for Success
Our worldwide practices describe specific behaviors that make Rakuten unique and united across the world. We expect Rakuten employees to model these 5 Shugi Principles of Success.
Always Improve, Always Advance - Only be satisfied with complete success - Kaizen
Passionately Professional - Take an uncompromising approach to your work and be determined to be the best
Hypothesize - Practice - Validate - Shikumika - Use the Rakuten Cycle to succeed in unknown territory
Maximize Customer Satisfaction - The greatest satisfaction for our teams is seeing their customers smile
Speed!! Speed!! Speed!! - Always be conscious of time - take charge, set clear goals, and engage your team

Beware of fraudulent job offers claiming to be from Rakuten. Rakuten does not send unsolicited job offers or request money during the recruitment process. Learn more: https://rakutenemploymentalert.com/

At the time of posting, Rakuten expects the Compensation (base salary + discretionary bonus) for this role to be within the range shown below. Individual compensation will vary based on job-related factors, including the skills, qualifications, and experience of the successful candidate as well as business need and geographic location. The successful applicant for this role will be eligible for stock options, health, vision, dental insurance, RRSP matching, Personal Time Off (PTO), Volunteer Time Off (VTO), and other employee benefits as the company implements.

CAD $127,008.00 - $177,008.00 annually

Top Skills

Airflow
BigQuery
Cloud Build
Cloud Composer
Cloud Deploy
Cloud Functions
Cloud Storage
Docker
Google Cloud Platform
Java
Kubernetes
Python
R
Scala
Terraform
Vertex AI

The Company
HQ: Toronto, Ontario
616 Employees
Year Founded: 2009

What We Do

Rakuten Kobo Inc. is the world’s only dedicated digital bookseller.

Owned by Tokyo-based Rakuten and headquartered in Toronto, Kobo enables more than 30 million readers worldwide to read anytime, anywhere and on any device.

With our award-winning eReaders and the free Kobo App for smartphones and tablets, Kobo is your portable reading world.
