Senior Operations Expert (Full-Time) - Shanghai

Reposted 12 Days Ago
Shanghai, Shanghai Municipality, Shanghai, CHN
In-Office
Senior level
Artificial Intelligence • Beauty • Productivity • Software
The Role
The role involves designing and managing cloud-native architectures, optimizing performance, ensuring high availability, and driving technical vision in operations for distributed applications.

Role Overview

You are the "architect" and "guardian" of Flowith’s global production environment. In this role, you are not just a firefighter putting out outages, but the cornerstone supporting exponential business growth. You will master the Cloudflare ecosystem and mainstream global cloud infrastructure to design and implement high-concurrency, low-latency distributed architectures. Through extreme performance optimization and a relentless pursuit of automation, you will ensure millions of global users always experience silky-smooth and stable AI interactions.

Key Responsibilities

  • Global Architecture Implementation: Design and manage cross-platform cloud-native architectures, driving multi-region deployment, elastic scaling, canary releases, and rapid rollbacks to ensure the efficient operation of global distributed applications.
  • Traffic & Performance Optimization: Lead the architectural design of managed caching and asynchronous messaging capabilities to handle hot-key caching, task decoupling, and traffic peak shaving with ease.
  • High Availability & Continuity: Build and continuously optimize the observability system (SLI/SLO and alert governance). Develop and drill backup/recovery, disaster recovery failover, and emergency response mechanisms to safeguard business continuity.
  • Technical Vision & Empowerment: Participate in tech stack selection and architecture reviews for core business features, finding the optimal balance between reliability, security, cost, and maintainability.

Requirements
  • You build systems that never sleep and automate everything you touch.
  • Hardcore Operations Foundation: 5+ years of SRE/DevOps/Operations experience, battle-tested on systems serving millions to tens of millions of users. Solid foundation in Linux and networking (TCP/IP, DNS, HTTP/HTTPS, TLS) and strong skills in troubleshooting complex failures.
  • Cloud-Native & Edge Master: Deep understanding of and proficiency with the Cloudflare ecosystem (CDN/WAF/DNS/Edge Computing), plus experience in resource governance of mainstream overseas cloud infrastructure (compute, network, load balancing, storage, managed databases).
  • Automation & Monitoring Enthusiast: Proficient in building and maintaining Prometheus + Grafana monitoring systems. Mastery of Terraform (or a similar IaC tool) and mainstream CI/CD toolchains. Able to write handy operations tooling in Shell/Python/Go.
  • Architectural Vision: Deep understanding of managed cloud caching and messaging systems (Serverless Redis, queues/event-driven architectures), and hands-on experience in security operations (least privilege, key management, access control, auditing).
  • Bonus: Experience in deploying underlying infrastructure for AI applications, or a strong passion for exploring how Agents/LLMs can empower intelligent operations (AIOps).

Benefits
  • Workspace, Culture & Lifestyle
    • Awesome Teammates: Work alongside a kind, creative, and hardworking crew of occasional "geeks" and visionaries.
    • Building the AGI Future: Participate in the in-house development of rapidly evolving AI agents and explore the future of AGI interactive interfaces.
    • Cool Offices in SH & SF: Enjoy our ultra-open workspaces with the ultimate freedom to seamlessly switch between our Shanghai and San Francisco locations.
    • Pet-Friendly Workplace: Bring your furry friends to work! Come play with our resident Orange Tabby and Golden Retriever Mix, or bring your own pets to hang out.
    • Island Hackathons: Join our annual internal hackathons, where we select a new city or country each year for innovative coding sessions and team bonding.
    • Free AI Tools & Tech Gear: Enjoy free, unlimited access to cutting-edge AI tools, plus the latest tech equipment like Apple Vision Pro and FPV drones.
    • Tech Events: Regularly participate in top-tier global tech meetups and innovation showcases.
    • Parties & Events: Celebrate with monthly birthday bashes and annual milestone parties.
    • Free Snacks & Drinks: Stay fueled with an endless supply of your favorite beverages and unlimited complimentary snacks.
  • Work Arrangements
    • Flexible Working Hours: Customize your schedule by arriving at the office between 10 AM and 1 PM for a standard 8-hour workday, 5 days a week.
    • Remote Work & Care: Embrace a supportive hybrid work model, featuring 1 additional Work-From-Home (WFH) day per month exclusively for female employees.
  • Comprehensive Benefits Package
    • Competitive Compensation: Earn an above-market salary structure with an optional equity/stock options package.
    • Wellness Program: Take care of your body and mind with free gym access and monthly on-site professional massages.
    • Exclusive Swag & Perks: Receive holiday surprise gift boxes, premium custom company apparel (T-shirts, hoodies, and jackets), and occasional exclusive internal brand discounts.

Skills Required

  • 5+ years of SRE/DevOps/Operations experience
  • Solid foundation in Linux and networking
  • Deep understanding of the Cloudflare ecosystem
  • Proficient in Prometheus + Grafana monitoring systems
  • Mastery of Terraform
  • Ability to write operational tools using Shell/Python/Go
  • Hands-on experience in security operations

The Company
HQ: San Francisco, CA
21 Employees
Year Founded: 2023

What We Do

The future isn't about talking to AI. It's about dancing with it. At flowith, we believe in the power of beauty and product excellence. Founded in California with a passion for innovation, our journey began in June 2023 when we set out to create something truly extraordinary. We embraced the philosophy of "product first", dedicating ourselves to crafting an experience that would redefine industry standards.

Our commitment to excellence is reflected in our relentless pursuit of perfection. From day one, our team has maintained an impressive pace, releasing new updates daily. This dedication paid off when we officially launched in August 2024, after a year of meticulous refinement and thousands of iterations. We're not just building a tool; we're creating a better way to interact with technology. Our vision is to bring back the magic that once made computers feel truly transformative, and we believe we're on the right path to achieving this goal.

As we look to the future, we're excited to announce that we're currently planning our angel round of funding. Having successfully completed a pre-seed round, we're now seeking visionary partners who share our passion for innovation and user-centric design.

At flowith, we're more than just a company – we're a team of dreamers and doers, committed to shaping the future of technology. If you're interested in being part of our journey, we'd love to hear from you. Together, we can create experiences that feel magical once again. For inquiries, please contact us at [email protected]. Start flowing today.
