At Cisco, we are a global leader in networking and IT, driving innovation and redefining how people connect, communicate, and collaborate. Our mission is to shape the future of the internet by creating unprecedented value and opportunity for our customers, employees, investors, and ecosystem partners. We are committed to fostering a diverse and collaborative environment where everyone can thrive and contribute to our collective success.
Your Impact
We are seeking a highly skilled and experienced DevOps / Site Reliability Engineer to join our team, focusing on the development and support of Observability capabilities for workloads across Cisco IT Datacenter and Cloud environments.
- You will reshape how we manage alerts, metrics, and logs by introducing deep learning and GenAI to enhance reliability services.
- You will take ownership of and responsibility for reliability, scalability, automation, and other issues related to uptime and availability of our monitoring solutions.
Minimum Qualifications
The ideal candidate will have a strong background in relevant Observability technologies & AI/ML with a proven track record of delivering innovative solutions that enhance system monitoring, performance, and reliability.
- Bachelor's degree in Computer Science, Computer Engineering, or a related field, or 5+ years of relevant experience.
- Understanding of IT lifecycle processes including architecture, design, implementation, and operations
- Understanding of security including OS hardening, firewalls, iptables, and working with Infosec
- Understanding of network basics like routers and switches
- Experience with software development tools like GitHub and Jenkins
- Python, Shell, Go, or similar programming experience.
- Experience with the full software development lifecycle, including design, development, testing, packaging, deployment, upgrade, and support.
- Open-source development experience.
- Familiarity with Agile software development.
- Leadership in building and maintaining SRE technologies.
- Experience with public cloud like AWS, GCP, or Azure.
- Experience with QA and testing of your own code and the entire platform.
Preferred Qualifications
- Experience with tool suites like Splunk Cloud, Splunk Observability Cloud, Elastic, Prometheus/Thanos & Grafana.
- Experience with ThousandEyes, Zabbix, AppDynamics (AppD), or similar tools is a plus.
- Experience with JavaScript, either Node.js or React.
- Experience implementing AI/ML and LLM-based agentic Observability use cases.
- Experience with infrastructure or application performance monitoring solutions, and testing experience in diverse and complex infrastructure.
- Experience with on-premises cloud technologies using VMware or OpenStack.
- Experience with container technologies like OpenShift, Kubernetes, and Docker.
- Experience with building and maintaining Red Hat or CentOS Linux.
- Experience with configuration automation using Ansible.
Behavioral Competencies
- Working with geographically distributed teams
- Self-motivated and willing to help where help is needed
- Able to build relationships, be culturally sensitive, align on goals, and learn quickly.
Why Cisco?
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.
Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.
We are Cisco, and our power starts with you.
What We Do
Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network – even the ones they don’t own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues – before they impact end-user experiences.
ThousandEyes is deeply integrated across the entire Cisco technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco’s leading Networking, Security, Collaboration, and Observability portfolios.
Why Work With Us
ThousandEyes is a quickly growing company with great opportunities. We empower enterprises to see, understand, and improve digital experiences for their customers and employees. We value professional development and work with team members to achieve their career goals.
Cisco ThousandEyes Offices
Hybrid Workspace
Employees engage in a combination of remote and on-site work.











