Senior AI Infrastructure & Platform Engineer - Riyadh, KSA

Posted 10 Days Ago
4 Locations
In-Office
Senior level
Artificial Intelligence • Computer Vision • Software
The Role
The Senior AI Infrastructure & Platform Engineer will build and optimize GPU-based AI infrastructure, manage deployments, and collaborate with data science teams to ensure efficient operations.
Role Overview

We are seeking a highly skilled Senior AI Infrastructure & Platform Engineer to join our client’s team in Riyadh. In this role, you’ll be responsible for building, managing, and optimizing scalable AI infrastructure and compute environments that support high-performance workloads, including GPU-accelerated AI/ML pipelines, cluster scheduling, and orchestration.

Key Responsibilities
  • Deploy, maintain, and optimize GPU-based compute clusters and infrastructure.
  • Manage and operate GPU orchestration tools and platforms such as:
    • Nvidia Base Command Manager (critical)
    • Nvidia AI Enterprise Suite
    • Nvidia GPU and Network Operators
    • Nvidia NIMs and Blueprints
  • Configure, deploy, and maintain compute workloads using scheduling and orchestration tools including:
    • Slurm (critical)
    • Vanilla Kubernetes
  • Install, configure, and maintain the underlying OS (e.g. Canonical Ubuntu) and supporting system software.
  • Monitor and troubleshoot infrastructure performance, availability, and reliability; ensure high uptime for AI/ML workloads.
  • Work with data scientists, ML engineers, and dev teams to define infrastructure requirements, resource allocation, and deployment workflows.
  • Develop automation scripts, CI/CD pipelines, and best practices for infrastructure provisioning and management.
  • Document architecture, configurations, and operational procedures; enforce security, compliance, and backup policies.

Requirements
Required Skills & Experience
  • Proven experience managing GPU-based AI/ML infrastructure and compute clusters.
  • Hands-on experience with:
    • Nvidia Base Command Manager
    • Nvidia AI Enterprise Suite
    • Nvidia GPU/Network Operators, NIMs, Blueprints
  • Strong experience with Slurm and/or Kubernetes orchestration.
  • Solid Linux system administration skills, preferably on Ubuntu or similar distributions.
  • Strong scripting/automation ability (e.g. Bash, Python, or relevant tooling) for provisioning, deployment, and maintenance.
  • Excellent troubleshooting and performance-tuning skills.
  • Experience collaborating with ML/data science teams and integrating infrastructure with their workflows.
  • Strong understanding of networking, security, resource allocation, and cluster management best practices.
Preferred Qualifications
  • Previous experience working in a high-performance computing (HPC) or AI-focused infrastructure team.
  • Knowledge of containerization, container orchestration, and GPUs in cloud or on-prem environments.
  • Experience with CI/CD, infrastructure-as-code (e.g. Terraform, Ansible), monitoring tools, and logging setups.
  • Familiarity with workload scheduling, job queuing, resource quotas, and GPU-shared environments.

Top Skills

Ansible
Bash
Kubernetes
Nvidia AI Enterprise Suite
Nvidia Base Command Manager
Nvidia GPU
Python
Slurm
Terraform
Ubuntu

The Company
Berlin
10 Employees
Year Founded: 2020

What We Do

DeepSource stands as a trusted partner for businesses seeking cutting-edge AI services in computer vision, natural language processing, and predictive analytics. With a particular focus on Arabic NLP and ChatGPT bot development, DeepSource is dedicated to empowering companies with groundbreaking solutions that streamline operations, optimize workflows, and enhance user experiences. Our commitment to excellence is evident in our approach to addressing a wide range of AI needs, from hiring top talent and managing end-to-end AI projects to providing tailored consulting and comprehensive training programs. DeepSource's team of experts is equipped with extensive knowledge and experience in various AI technologies, which enables them to develop and deploy advanced solutions across multiple industries. Our adaptive strategies and innovative methodologies allow businesses to stay competitive in today's rapidly evolving digital landscape.

