What You’ll Be Doing:
- Cloud Platform Architecture & Operations
- Plan, deploy, monitor, and maintain AWS services (EC2, S3, VPC, Lambda, EKS, etc.) and Alibaba Cloud services (ECS, OSS, VPC, Function Compute, ACK, etc.).
- Design highly available, auto-scaling cloud architectures, optimizing network (e.g., Alibaba Cloud CEN, AWS Direct Connect), storage, and compute resource configurations.
- Monitoring & Incident Management
- Implement full-stack monitoring and alerting using cloud-native tools (AWS CloudWatch, Alibaba Cloud CloudMonitor) and open-source solutions (Prometheus+Grafana, ELK).
- Lead critical incident response, perform root cause analysis (e.g., resource contention, misconfigurations, network latency), and implement preventive measures.
- Cost Optimization & Resource Management
- Analyze cloud resource usage and reduce costs through reserved instances, auto-scaling, and storage lifecycle policies (e.g., AWS S3 Intelligent-Tiering, Alibaba Cloud OSS Archive).
- Establish resource quota management strategies to prevent waste and overspending.
- Security & Compliance
- Implement cloud security baselines (security groups, IAM policies, Alibaba Cloud RAM permissions, AWS Security Hub), conduct regular security audits, and remediate vulnerabilities.
- Design granular access controls using AWS IAM and Alibaba Cloud RAM, and enforce activity and database auditing (e.g., AWS CloudTrail, Alibaba Cloud DAS).
- Cross-Team Collaboration & Knowledge Sharing
- Collaborate with development teams to optimize application architectures and provide cloud-native solutions (Serverless, microservices).
- Document operational procedures (SOP manuals) and lead internal technical training sessions.
What We Look For In You:
- Technical Skills
- Mastery of core services (compute/storage/network/security) on AWS or Alibaba Cloud, and familiarity with the other platform.
- Proficiency in Linux/Windows system operations and automation tools (Shell/Python/Ansible).
- Hands-on experience with containerized operations (Kubernetes, ECS/EKS, ACK) and cloud-native technologies (e.g., Service Mesh).
- Experience Requirements
- 5+ years of operations experience, with at least 3 years focused on public cloud (AWS/Alibaba Cloud) environments managing 100+ instances.
- Experience in building cloud platforms from scratch, hybrid cloud architecture design, or large-scale migration projects (e.g., IDC-to-cloud) is preferred.
- Soft Skills
- Strong problem-solving skills with the ability to handle high-pressure operational challenges.
- Excellent communication skills to collaborate with development, testing, and security teams.
- Certifications & Education
- AWS Certified SysOps Administrator or Alibaba Cloud ACP/ACE certifications are preferred.
- Bachelor’s degree or higher in Computer Science, Network Engineering, or related fields.
Perks & Benefits
- Competitive total compensation
- Comprehensive insurance coverage for employees and their dependants
- More that we'd love to tell you about along the way!
What We Do
Founded in 2017, OKX is one of the world’s leading cryptocurrency spot and derivatives exchanges. OKX innovatively adopted blockchain technology to reshape the financial ecosystem by offering some of the most diverse and sophisticated products, solutions, and trading tools on the market. Trusted by more than 20 million users in over 180 regions globally, OKX strives to provide an engaging platform that empowers every individual to explore the world of crypto. In addition to its world-class DeFi exchange, OKX serves its users with OKX Insights, a research arm that is at the cutting edge of the latest trends in the cryptocurrency industry. With its extensive range of crypto products and services, and unwavering commitment to innovation, OKX’s vision is a world of financial access backed by blockchain and the power of decentralized finance.








