We are seeking a high-caliber Senior Site Reliability Engineer (SRE) based in California to ensure the scalability, reliability, and runtime efficiency of our next-generation platform. In this role, you will bridge the gap between development and operations, working closely with our global engineering teams.
We are looking for a unique engineering mindset: someone who brings a positive, collaborative energy to the daily grind, but can instantly pivot into a hyper-focused, high-ownership responder when an incident strikes.
Key Responsibilities
- Production Reliability & Guardrails: Partner with the Platform Engineering team to implement reliability guardrails, ensuring applications running on AWS meet strict uptime and SLA requirements.
- CI/CD & Repository Management: Own the deployment pipelines and code management practices, working extensively in GitHub.
- Incident Management: Lead rapid-response troubleshooting during production incidents; conduct thorough blameless post-mortems to continuously harden our systems.
- Observability & Performance: Implement advanced monitoring, logging, and alerting systems to proactively detect and mitigate system anomalies.
- Cross-Border Collaboration: Act as a key technical bridge between our US operations and international engineering hubs, leveraging bilingual communication to streamline complex technical alignment.
Requirements
1. Technical Focus
- Ecosystem Expertise (Must-Haves): Deep, practical experience managing application deployment and runtime environments on AWS, alongside master-level knowledge of advanced Git workflows and GitHub Actions.
- Core Toolkit: Strong proficiency in monitoring tools, log management, and scripting for quick triaging and troubleshooting.
- Ownership & Transparency: You are radically open, highly responsive, and communicative. You don't just clear tickets; you own the production environment's health end-to-end.
- Pressure-Resistance: High psychological resilience. You maintain a happy, positive attitude during smooth operations, yet feel a healthy, driving sense of urgency and laser-focus during high-stakes incidents.
- Bilingual Capability: Absolute fluency in Mandarin and English (verbal and written) is mandatory for effective technical alignment across our global teams.
Benefits
- Competitive base salary + equity packages aligned with California market standards.
Skills Required
- Deep, practical experience managing application deployment and runtime environments on AWS
- Master-level knowledge of advanced Git workflows and GitHub Actions
- Ownership of deployment pipelines and repository management (CI/CD)
- Strong proficiency with monitoring tools, log management, and scripting for triage and troubleshooting
- Lead rapid-response incident management and conduct blameless post-mortems
- Absolute fluency in Mandarin and English (verbal and written)
- Based in California
- Senior-level Site Reliability Engineer experience
What We Do
Kody is on a mission to make in-person payment acceptance easy. Today, paying in person presents common problems for businesses, such as high costs, long queues, and a limited choice of payment methods. Kody fully integrates the payment ecosystem so that businesses can offer customers more control over their payment choices and make transactions quicker and simpler. Kody was founded by a small group of final-year high school students and launched in July 2022. Its 24-year-old founder and CEO, Yoyo Chang, studied at the University of Cambridge & York whilst raising US$10M. Today, Kody's platform is growing to connect millions of end-users with venues all over the world.







