About Mozn
Mozn is a leading Enterprise AI company enabling organizations to make informed decisions in two critical domains: Financial Crime Prevention and Enterprise Knowledge Intelligence.
We’re a diverse, collaborative team of innovators united by a shared purpose: to build AI that delivers tangible business value, builds trust, and empowers people and organizations with augmented intelligence. Our culture is built on the relentless pursuit of excellence and meaningful impact.
If you’re passionate about working alongside exceptional talent on world-class AI, and you want the autonomy and runway to do the best work of your career, join us in shaping the future of intelligent enterprises.
About the role
We are looking for a highly motivated Cloud Platform Engineer III to join our Cloud Engineering team. The ideal candidate is passionate about data reliability, performance, and scalable backing services.
This role focuses on the reliability, performance, operation, automation, and continuous improvement of critical backing services such as MySQL, PostgreSQL, MongoDB, Elasticsearch/OpenSearch, Kafka, analytical databases (e.g., StarRocks, ClickHouse), and other database and messaging technologies across cloud-native and hybrid environments.
This is not a traditional DBA role. The ideal candidate understands distributed systems, Kubernetes, cloud platforms, automation, IaC, observability, and AI-assisted workflows, and can help product engineering teams use backing services safely and effectively.
What you'll do
Data Reliability & Backing Services Operations
- Own reliability, performance, scalability, and operational health of MySQL, PostgreSQL, MongoDB, Elasticsearch/OpenSearch, Kafka, StarRocks, ClickHouse, and similar platforms.
- Define best practices for how product engineering teams use transactional, document, search, messaging, and analytical platforms.
- Design and maintain highly available, scalable, and resilient platform services, including replication, backup, recovery, failover, and disaster recovery capabilities.
- Perform capacity planning, performance tuning, workload reviews, upgrades, patching, and lifecycle management for platform services.
- Identify and resolve risks such as slow queries, hot partitions, consumer lag, replication lag, index growth, retention issues, and storage saturation.
- Troubleshoot and resolve complex production issues related to databases, messaging systems, search platforms, and distributed data platforms.
Kubernetes & Cloud Platform Engineering
- Deploy, operate, and troubleshoot stateful workloads in Kubernetes-based environments.
- Apply a strong understanding of Kubernetes fundamentals, including networking, storage, workload lifecycle management, scalability, and reliability concepts.
- Enable and support Kubernetes-based deployments of database, messaging, search, and analytical platforms using cloud-native patterns and operational best practices.
Automation & Platform Enablement
- Use automation, Infrastructure as Code, GitOps, and CI/CD to make backing services repeatable, reliable, and easier to operate.
- Contribute to self-service platform capabilities, guardrails, dashboards, alerts, runbooks, and production readiness checks.
- Use AI-assisted workflows where appropriate for incident triage, root cause analysis, query analysis, capacity forecasting, documentation, and developer support.
- Collaborate with Product Engineering, SRE, Security, Data Engineering, and Cloud Platform teams to improve reliability, performance, availability, and security posture.
Qualifications
- 4-7 years of experience in Platform Engineering, SRE, Database Reliability Engineering, Data Platform Engineering, DevOps, or related roles.
- Strong hands-on experience with MySQL, PostgreSQL, Kafka, and at least one of MongoDB or Elasticsearch/OpenSearch in production environments.
- Experience with analytical or distributed data platforms such as StarRocks, ClickHouse, Apache Doris, Druid, Pinot, or similar OLAP systems is highly desirable.
- Hands-on experience operating stateful workloads in Kubernetes-based environments.
- Good understanding of high availability, replication, backup and recovery, disaster recovery, capacity planning, and performance tuning concepts.
- Familiarity with distributed systems concepts including sharding, replication, partitioning, consistency, compaction, backpressure, consumer lag, and query optimization.
- Experience with at least one major cloud platform (AWS, GCP, or OCI).
- Experience automating provisioning, deployment, configuration, monitoring, and lifecycle management using tools such as Terraform, Helm, Ansible, GitOps, or similar automation frameworks.
- Strong scripting or programming skills in Python, Bash, Go, or similar.
- Experience with observability platforms such as the LGTM stack (Loki, Grafana, Tempo, Mimir), Prometheus/Grafana, ELK/OpenSearch, Datadog, or equivalent.
- Strong troubleshooting, problem-solving, and debugging skills across distributed systems.
- Excellent communication, collaboration, and documentation skills.
- Demonstrated curiosity, ownership mindset, adaptability, and ability to guide product engineering teams.
Preferred Qualifications
- Experience designing, operating, or optimizing large-scale distributed database, messaging, search, or analytics platforms.
- Experience operating or optimizing analytical databases and OLAP systems such as StarRocks, ClickHouse, Apache Doris, Druid, or Pinot.
- Experience with streaming and real-time data platforms leveraging technologies such as Kafka, Flink, Spark, CDC, or similar ecosystems.
- Exposure to Analytics, Data Engineering, AI/ML platforms, LLM-based applications, or AI infrastructure projects.
- Experience using AI-assisted tooling or agents to improve operations, troubleshooting, documentation, or developer self-service.
- Familiarity with modern data architectures and open table formats such as Apache Iceberg, Delta Lake, or Apache Hudi is a strong plus.
- Experience with GitOps practices, CI/CD pipelines, and modern platform engineering methodologies.
- Relevant certifications in Cloud Platforms, Kubernetes, Database Technologies, or Data Engineering are a plus.
Benefits
- You will be at the forefront of an exciting time for the Middle East, joining a high-growth rocket ship in a fast-moving space.
- You will be given a lot of responsibility and trust. We believe that the best results come when the people responsible for a function are given the freedom to do what they think is best.
- The fundamentals will be taken care of: competitive compensation, top-tier health insurance, and an enabling culture so that you can focus on what you do best.
- You will enjoy a fun and dynamic workplace, working alongside some of the greatest minds in AI.
- We believe strength lies in difference: we embrace everyone for who they are and empower them to be the best version of themselves.
Skills Required
- 4-7 years of experience in Platform Engineering, SRE, Database Reliability Engineering, Data Platform Engineering, DevOps, or related roles
- Hands-on production experience with MySQL
- Hands-on production experience with PostgreSQL
- Hands-on production experience with Kafka
- Production experience with MongoDB or Elasticsearch/OpenSearch (at least one)
- Experience operating stateful workloads in Kubernetes-based environments
- Experience with at least one major cloud platform (AWS, GCP, or OCI)
- Automation and IaC experience (Terraform, Helm, Ansible, GitOps, or similar)
- Strong scripting or programming skills (Python, Bash, Go, or similar)
- Experience with observability platforms (LGTM stack, Prometheus/Grafana, ELK/OpenSearch, Datadog, or equivalent)
- Understanding of high availability, replication, backup/recovery, disaster recovery, capacity planning, and performance tuning
- Familiarity with distributed systems concepts (sharding, replication, partitioning, consistency, backpressure, consumer lag, compaction, query optimization)
- Strong troubleshooting, problem-solving, and debugging skills for distributed systems
- Excellent communication, collaboration, and documentation skills
- Experience with analytical/distributed OLAP systems (StarRocks, ClickHouse, Apache Doris, Druid, Pinot)
- Experience with streaming/real-time data platforms and ecosystems (Flink, Spark, CDC)
- Familiarity with modern data architectures and open table formats (Apache Iceberg, Delta Lake, Apache Hudi)
- Experience using AI-assisted tooling for operations or developer self-service
- Relevant certifications in Cloud Platforms, Kubernetes, Database Technologies, or Data Engineering
What We Do
Mozn is a Saudi technology company committed to advancing digital humanity by harnessing artificial intelligence. It builds enterprise AI-powered products (FOCAL, the end-to-end Risk and Compliance platform, and OSOS, the leading Arabic Gen AI platform) along with tailored AI solutions designed to meet the unique needs of enterprises across various sectors.








