Cisco ThousandEyes
Cisco ThousandEyes Employee Perspectives
What’s your rule for releasing fast without chaos — and what KPI proves it?
To release quickly without chaos, it’s essential to establish automated testing and real-time alerting at the individual service level. As our services grow more complex, predicting every possible interaction in advance becomes unmanageable. Instead, each service should define clear health metrics — such as error rates, response times and availability — and continuously monitor these indicators. By injecting continuous baseline synthetic traffic (independent of real users), we can proactively detect degradation or failures before they impact customers, even for transitive dependencies. Fast, automated rollback mechanisms further reduce risk, enabling confident, rapid releases. The KPI I use to prove this out is mean time to recovery.
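As a rough illustration of the synthetic-traffic idea, the sketch below continuously probes a service, computes an error rate and p95 latency over a sampling window, and raises an alert when either crosses a threshold. The endpoint URL, thresholds and window size are hypothetical placeholders rather than details from the interview; in practice the alert would page on-call or trigger the automated rollback described above.

```python
# Minimal sketch of a baseline synthetic probe; the endpoint, thresholds
# and window size below are hypothetical placeholders, not real config.
import statistics
import time

import requests

ENDPOINT = "https://service.internal/healthz"   # hypothetical service URL
ERROR_RATE_THRESHOLD = 0.01                     # alert above 1% errors
P95_LATENCY_THRESHOLD = 0.500                   # alert above 500 ms p95
WINDOW = 60                                     # samples per evaluation window


def probe_once() -> tuple[bool, float]:
    """Send one synthetic request; return (success, latency in seconds)."""
    start = time.monotonic()
    try:
        resp = requests.get(ENDPOINT, timeout=2)
        return resp.ok, time.monotonic() - start
    except requests.RequestException:
        return False, time.monotonic() - start


def evaluate_window() -> None:
    """Collect one window of synthetic samples and alert on degradation."""
    results = [probe_once() for _ in range(WINDOW)]
    error_rate = sum(1 for ok, _ in results if not ok) / len(results)
    p95_latency = statistics.quantiles([lat for _, lat in results], n=20)[18]
    if error_rate > ERROR_RATE_THRESHOLD or p95_latency > P95_LATENCY_THRESHOLD:
        # In a real pipeline this would page on-call and/or trigger an
        # automated rollback of the most recent release.
        print(f"ALERT: error_rate={error_rate:.2%}, p95={p95_latency:.3f}s")


if __name__ == "__main__":
    while True:
        evaluate_window()
        time.sleep(10)  # pause between evaluation windows
```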
What standard or metric defines “quality” in your toolchain?
For me, “quality” in a toolchain is defined by architectural simplicity and the ability to reason about system behavior. I apply a concept similar to cyclomatic complexity — not just to code, but to system architecture as a whole. By modeling services and their interactions as a graph, I assess the impact of potential outages or degradations, not only for each service but also for downstream dependencies. Each connection in this graph represents more than just up/down status; it includes metrics like latency, retry behavior and resource saturation. For example, bufferbloat illustrates how a local optimization (enlarging network buffers to avoid packet loss) can cause unexpected system-wide effects such as increased latency. High architectural complexity makes systems brittle and harder to maintain, while simplicity enhances reliability. Ultimately, quality is reflected in how easily we can understand, reason about and operate the system.
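A minimal sketch of the graph idea, assuming a made-up service topology: each directed edge is a caller/callee relationship annotated with example metrics, and a reverse traversal computes the "blast radius" of a degraded service, i.e. every caller that is transitively impacted. The service names and numbers are purely illustrative.

```python
# Illustrative sketch of modeling service interactions as a graph; the
# topology and edge metrics are made-up examples, not a real architecture.
from collections import defaultdict, deque

# Directed edges: (caller, callee) annotated with per-edge health metrics.
edges = {
    ("web", "api"):     {"p95_ms": 40, "retries": 1},
    ("api", "auth"):    {"p95_ms": 15, "retries": 0},
    ("api", "billing"): {"p95_ms": 90, "retries": 2},
    ("billing", "db"):  {"p95_ms": 25, "retries": 0},
    ("auth", "db"):     {"p95_ms": 10, "retries": 0},
}

# Reverse adjacency: for each callee, the set of callers that depend on it.
dependents = defaultdict(set)
for caller, callee in edges:
    dependents[callee].add(caller)


def blast_radius(failed_service: str) -> set[str]:
    """Every service transitively impacted if `failed_service` degrades."""
    impacted, queue = set(), deque([failed_service])
    while queue:
        for caller in dependents[queue.popleft()]:
            if caller not in impacted:
                impacted.add(caller)
                queue.append(caller)
    return impacted


# A degradation of "db" ripples up through billing and auth to api and web,
# even though only billing and auth call it directly.
print(blast_radius("db"))  # {'billing', 'auth', 'api', 'web'}
```

In a fuller model, the per-edge latency, retry and saturation figures would feed a weighted impact score rather than a plain reachability set.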
Share one recent adoption and its measurable impact.
Recently, I’ve expanded my use of AI beyond code assistance to non-coding tasks, particularly technical document review. By leveraging AI to summarize proposals and research the technologies they reference, I significantly reduce the time spent on background research and ramp-up, allowing me to focus on the critical parts of the proposal. Additionally, I use AI to summarize chat discussions, as well as to serve as a first-pass editor for my own writing. Summarization and grading are strengths of current LLMs, so I’m finding it very effective to take a collaborative approach to using AI in that respect. While integration across all tools is still developing, the measurable impact has been a noticeable increase in productivity and efficiency — often reducing document review and summarization time by 25 to 30 percent. I’m optimistic about further productivity gains as AI tools mature.
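Purely as an illustration of the first-pass-editor workflow (the interview does not name a specific tool), the sketch below sends a draft document to an LLM with a review prompt and returns a summary plus review notes. The SDK, model name, prompt and input file are stand-in assumptions.

```python
# Illustrative sketch of an LLM as a first-pass document reviewer; the SDK,
# model name, prompt and input file are stand-in choices, not the tools
# described in the interview.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

REVIEW_PROMPT = (
    "Summarize this technical proposal, list the referenced technologies "
    "worth reading up on, and flag any unclear or unsupported claims."
)


def first_pass_review(document_text: str) -> str:
    """Return a summary plus review notes for a draft document."""
    response = client.chat.completions.create(
        model="gpt-4o",  # hypothetical model choice
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": document_text},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    with open("proposal.md") as f:  # hypothetical input document
        print(first_pass_review(f.read()))
```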
