About the Role
Gradion is expanding its SRE team for a long-term managed services client. You will be part of a global, follow-the-sun SRE function, responsible for platform stability, cloud infrastructure, and incident response.
This role suits engineers who are technically solid, self-directed, and comfortable operating in a fast-moving, internationally distributed environment. You will go through a structured onboarding alongside an internal SRE team before taking on independent operational responsibility.
What You Will Do
- Own platform availability: monitor, triage, and resolve incidents within defined SLA windows
- Manage cloud infrastructure on AWS and/or GCP - provisioning, scaling, and day-to-day operations
- Maintain and improve CI/CD pipelines and GitOps workflows
- Operate observability systems: monitoring, logging, and alerting at production scale
- Participate in the on-call rotation as part of the global follow-the-sun coverage model
- Configure, deploy, and manage AI tooling and MCP servers in production environments
- Contribute to infrastructure automation, scripting, and internal tooling
- Write clear post-incident reviews and contribute to the monthly operational report
- Collaborate closely with engineering teams across multiple time zones
What You Bring
- 4+ years in a DevOps / SRE / Platform Engineering role within an international team
- Solid Kubernetes knowledge - cluster operations, troubleshooting, and configuration
- Hands-on cloud experience with AWS and/or GCP
- Good understanding of networking fundamentals - DNS, load balancing, firewalls, VPC
- Scripting and automation skills (Python, Bash, or similar)
- Experience with CI/CD tools and GitOps-based delivery
- Working knowledge of monitoring and observability systems (Prometheus, ELK, or equivalent)
- Fluent English (C1 minimum) - daily communication with European stakeholders is a core requirement
- Self-directed and proactive - you ask the right questions and drive issues to resolution without waiting to be told
Nice to Have
- Experience configuring and managing MCP servers and AI tooling in production
- Exposure to AI enablement workflows or LLM infrastructure
- Background supporting eCommerce or SaaS platforms
- Familiarity with the Frontastic / commercetools Frontend ecosystem
🏆 Join Vietnam’s Best IT Company - Gradion Vietnam (formerly NFQ Vietnam) was recognized by ITViec for 8 consecutive years, including 2 successive years as the Winner. Work with some of the best minds in the industry and be part of a company that’s redefining how businesses scale through technology.
🌍 Career Growth & Leadership Development - Work closely with our leadership team, gain mentorship from experienced executives, and have direct exposure to high-level strategic decisions. Your growth is limitless: as long as you’re ready to step up, opportunities will always be there for you.
🚀 AI-First Engineering & Strategic Consulting - Our engineering culture integrates AI as a core driver of design, development, and optimization - not an add-on. As a forward-thinking consultancy, we go beyond traditional engineering, combining technical excellence with a strategic mindset to deliver transformative solutions for ambitious businesses.
💰 Competitive Compensation - We believe great talent deserves great rewards. Expect an attractive salary, performance-based bonuses, and a benefits package that reflects your impact. We value talent over salary budgets - exceptional contributions deserve exceptional rewards.
✨ And Many More Benefits to Explore! Most importantly, we offer a healthy work-life balance and an environment where you can thrive professionally and personally, including:
- Performance bonus of up to 2 months’ salary.
- Performance review twice a year, so your growth is recognized and rewarded.
- Premium healthcare for you, plus an annual health check.
- 15 days of annual leave.
- Full salary during probation.
- Hybrid working for real flexibility.
- Monthly Happy Hour and Community Tech activities.
- Work on global projects as part of an innovation team that shapes ideas for the hi-tech world.
- Diverse training programs to keep you growing.
Skills Required
- 4+ years in a DevOps / SRE / Platform Engineering role within an international team
- Solid Kubernetes knowledge - cluster operations, troubleshooting, and configuration
- Hands-on cloud experience with AWS and/or GCP
- Good understanding of networking fundamentals - DNS, load balancing, firewalls, VPC
- Scripting and automation skills (Python, Bash, or similar)
- Experience with CI/CD tools and GitOps-based delivery
- Working knowledge of monitoring and observability systems (Prometheus, ELK, or equivalent)
- Fluent English (C1 minimum)
What We Do
At Gradion, we advise businesses on Digital & Deep Tech. With over 23 years of experience and a global team of 300+ experts across 7 countries, our impact spans e-commerce, retail, mobility, travel & logistics, and intralogistics, where we’ve delivered bold, measurable, transformative technology solutions. Powered by an AI-first foundation, our journey is defined by customer stories of bold transformations in Business, Technology and People. We've partnered with visionary companies to architect the strategies, systems, and technologies that propelled them to extraordinary success. From large-scale platforms to AI, data, cybersecurity & robotic solutions, Gradion delivers tailored, future-ready solutions, empowering the next generation of Billion Dollar Companies.