Senior Site Reliability Engineer, Fleet - REMOTE

Posted 6 Hours Ago
Hiring Remotely in Canada
Remote
116K-160K Annually
Senior level
Hardware • Information Technology • Security • Software • Cybersecurity • Conversational AI
Cisco Meraki simplifies powerful technology so that passionate people can focus on their mission.
The Role
The Senior Site Reliability Engineer will ensure the stability and scalability of Cisco Meraki's infrastructure. Responsibilities include automating maintenance processes, debugging failure scenarios, optimizing CI/CD workflows, collaborating with engineering teams, and developing automated tools for data collection and compliance.
Summary Generated by Built In

Cisco Meraki, a division of Cisco Networking, is a cloud-managed IT company and leader in cloud-controlled Wi-Fi, routing, and security. Our intuitive platform enables organizations of all sizes to deliver customer and employee experiences at scale. To provide best-in-class technologies to our customers, we’ve created an unrivaled company culture for our employees. One where diverse backgrounds, perspectives, and experiences shape our work and fuel our evolution. One that is collaborative, flexible, and inclusive and provides employees with the autonomy to develop technology that’s accessible and secure for everyone.

We are seeking a Senior Site Reliability Engineer (SRE) to join our dynamic SRE Fleet team, which is responsible for ensuring the stability, scalability, and efficiency of our infrastructure. You will play a critical role in maintaining and improving a fleet of more than 2,000 machines across a global cloud environment. This role is highly collaborative, involving close interaction with engineering and SRE teams in the UK and San Francisco to scale and optimize our infrastructure.

Responsibilities

  • Develop and maintain automation code for cloud maintenance processes using Ansible and Ruby.
  • Debug and resolve complex failure scenarios across large-scale systems, ensuring high availability and reliability.
  • Design, implement, and optimize GitLab CI pipelines to streamline deployment and testing workflows.
  • Collaborate with engineering teams to identify and address performance bottlenecks and scaling challenges.
  • Proactively troubleshoot issues across the fleet, using a deep understanding of Linux systems and networking.
  • Contribute to the creation of robust unit tests and infrastructure testing suites with RSpec.
  • Participate in collaborative projects to improve infrastructure efficiency, scalability, and observability.
  • Work cross-functionally with teams in different time zones, fostering a culture of shared ownership and reliability.
  • Develop and maintain automated tools for collecting infrastructure data to support compliance requirements.
  • Streamline compliance processes by reducing manual overhead through automation.

You are an ideal candidate if you have:

  • 5+ years of experience in Site Reliability Engineering, DevOps, or a similar role in large-scale cloud environments.
  • Strong expertise in:
      • Ansible for infrastructure automation.
      • Ruby programming and testing frameworks like RSpec.
      • Linux systems administration and troubleshooting.
      • CI/CD pipelines, particularly GitLab CI.
  • Demonstrated experience troubleshooting and debugging in complex distributed systems.
  • Experience managing and optimizing fleets of thousands of machines.
  • Excellent collaboration skills and the ability to work effectively across teams in multiple time zones.
  • A passion for automation, scalability, and infrastructure as code.

Bonus points for:

  • Familiarity with cloud providers (AWS, GCP, or similar).
  • Knowledge of monitoring and observability tools.
  • Experience with disaster recovery and high availability strategies.

At Cisco Meraki, we’re challenging the status quo with the power of diversity, inclusion, and collaboration. When we connect different perspectives, we can imagine new possibilities, inspire innovation, and release the full potential of our people. We’re building an employee experience that includes appreciation, belonging, growth, and purpose for everyone.

Cisco is an Affirmative Action and Equal Opportunity Employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case-by-case basis, qualified applicants with arrest and conviction records.


Compensation Range:

$115,900 to $160,400 USD

Message to applicants applying to work in the U.S. and/or Canada: 
When available, the salary range posted for this position reflects the projected hiring range for new hire, full-time salaries in U.S. and/or Canada locations, not including equity or benefits. For non-sales roles the hiring ranges reflect base salary only; employees are also eligible to receive annual bonuses. Hiring ranges for sales positions include base and incentive compensation target. Individual pay is determined by the candidate's hiring location and additional factors, including but not limited to skillset, experience, and relevant education, certifications, or training. Applicants may not be eligible for the full salary range based on their U.S. or Canada hiring location. The recruiter can share more details about compensation for the role in your location during the hiring process.

U.S. employees have access to quality medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, short and long-term disability coverage, basic life insurance and numerous wellbeing offerings.

Employees receive up to twelve paid holidays per calendar year, which includes one floating holiday (for non-exempt employees), plus a day off for their birthday. Non-Exempt new hires accrue up to 16 days of vacation time off each year, at a rate of 4.92 hours per pay period. Exempt new hires participate in Cisco’s flexible Vacation Time Off policy, which does not place a defined limit on how much vacation time eligible employees may use, but is subject to availability and some business limitations. All new hires are eligible for Sick Time Off subject to Cisco’s Sick Time Off Policy and will have eighty (80) hours of sick time off provided on their hire date and on January 1st of each year thereafter.  Up to 80 hours of unused sick time will be carried forward from one calendar year to the next such that the maximum number of sick time hours an employee may have available is 160 hours. Employees in Illinois have a unique time off program designed specifically with local requirements in mind. All employees also have access to paid time away to deal with critical or emergency issues. We offer additional paid time to volunteer and give back to the community.

Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components. For quota-based incentive pay, Cisco typically pays as follows:

0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;

1.5% of incentive target for each 1% of attainment between 50% and 75%;

1% of incentive target for each 1% of attainment between 75% and 100%; and once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.

For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.  

Top Skills

Ruby

The Company
HQ: San Francisco, CA
3,000 Employees
Hybrid Workplace
Year Founded: 2006

What We Do

Meraki is a Greek word meaning “something done with soul, creativity, or love.” With this name as our mantra, we’re building a welcoming workplace that attracts eclectic, curious, purposeful people who unite to ignite our customers’ passions. Together, we create powerful, simple technology with the potential to change everything.

Why Work With Us

We believe that when passionate people are able to spend less time struggling with technology, they can spend more time on what matters—like teaching kids, running businesses, keeping airports safe, and connecting disaster victims with relief. That’s the real power of simplicity.

Cisco Meraki Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Flexible
Cisco Meraki San Francisco Office (HQ)
Cisco Meraki Bengaluru Office
Cisco Meraki Mexico City Office
Cisco Meraki Chicago Office
Cisco Meraki London Office
Cisco Meraki Sydney Office
