When Edge-to-Edge Quality Matters, End-to-End Network Observability is Mission Critical

April 27, 2023
Original post on Internet Telephony

You cannot manage what you cannot measure. The adage applies with particular force to enterprise networking. The ability to understand what is happening at the edge of the network, down to the device and application level, with great granularity and confidence, is no longer a nice-to-have, but a must-have. That’s especially true with the growing popularity of work-from-home (WFH) models.

“Managing a local area network, and related wide area network, was much easier to do when the private, physical network was in one place, or in many places but operated with more consistency, such as with identical gear and components,” said Steve Pitchon, Chief Client Officer at ConnX AI, an AI-driven MSP that has developed, along with its partners, an integrated approach it calls AI SD-WAN, powered by Active Assurance. “Today, we are responsible for ensuring the highest possible Quality of Experience (QoE) over the Internet, using home routers of all types, computers of all types, and smartphones of all types. While getting there took some foresight and engineering, we have been able to prove that this approach can support contact center agents, dispatchers, doctors, nurses and many other people who are in mission-critical roles.”

Like many of the MSPs that are being called on by more and more organizations, ConnX had no choice but to act fast when the pandemic arrived unannounced a few years ago.

“We were already implementing highly distributed network, data, and applications services then, but the demand for not just basic QoS, but brilliant QoE, accelerated our investment in adding ever more intelligence into everything we delivered. Today, with the popularity of AI and Machine Learning (ML) applications like ChatGPT, there will be even more value generated as chatbots become more sophisticated and human-like, taking highly resilient and fast computation in the cloud and at the edge.”

Observability gives MSPs like ConnX the ability to understand what is happening across their environments and systems in real time, all the time. Instead of relying on siloed point-to-point connections, observability across the entire platform can centralize the deluge of data being generated locally, nationally, and globally, giving the teams responsible for premium service comprehensive visibility into the state of their systems.

“Observability gives us a complete, holistic view of every one of our customers’ systems, including distributed systems, microservices, voice and video collaboration applications, and cloud-based environments,” Pitchon said. “This helps our NOC teams to detect and diagnose issues across their entire infrastructure quickly and accurately, especially when we have automation built in, meaning the network can often manage itself based on the policies we set for the very large, Fortune 500 enterprises we serve.”

Observability enables MSPs to manage multi-tenant instances, supporting more advanced and valuable applications, including conversational AI and the many rapidly emerging solutions that are, for example, making contact centers more profitable and driving confidence that work-from-anywhere approaches do, in fact, work.

“We’ve been so successful by starting up every new customer with a solid understanding of their high-level business objectives,” Pitchon said. “Once we understand their goals – for example, delivering competitive, superior service to customers; securing every conversation for customers in highly regulated industries; controlling what they spend to avoid bill shock they may have experienced in the past with out-of-control cloud services – we know what we need to do, and we know how we can constantly prove we are living up to our promised SLAs.”

By leveraging observability, organizations can optimize their operational and business processes, resulting in shorter mean time to repair (MTTR) and remediation of connectivity and bandwidth issues before they become problematic.
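For readers who want to see how MTTR is typically derived, the short Python sketch below averages the time between detection and resolution across a set of incidents. The incident timestamps and the function name are hypothetical, made up for illustration; they are not ConnX data or part of any ConnX product.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Return MTTR as a timedelta, given (detected_at, resolved_at) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the repair durations, starting from a zero-length timedelta.
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident log, for illustration only.
incidents = [
    (datetime(2023, 4, 1, 9, 0), datetime(2023, 4, 1, 9, 30)),    # 30 minutes
    (datetime(2023, 4, 2, 14, 0), datetime(2023, 4, 2, 14, 10)),  # 10 minutes
    (datetime(2023, 4, 3, 8, 0), datetime(2023, 4, 3, 8, 20)),    # 20 minutes
]

mttr = mean_time_to_repair(incidents)
print(mttr)  # 0:20:00
```

Tracking this figure over time is one simple way to check whether faster detection is actually translating into faster repair.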

“Observability for us means we don’t have to wait for a call if the network is compromised by a cyberattack,” Pitchon explained. “With AI comes anomaly detection and automation so proven that a single computer registered to the network that shows signs of risk can be detached from the network before an SQL injection attack works its way into sensitive corporate databases.”
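Pitchon does not describe how ConnX’s anomaly detection works, so the following Python sketch should be read only as a generic illustration of the idea. It swaps in a plain z-score test, one of the simplest statistical approaches, in place of whatever models ConnX actually uses: a device is flagged when its latest reading sits far outside its own historical baseline. The device names, readings, threshold, and function names are all assumptions.

```python
from statistics import mean, stdev


def is_anomalous(baseline, sample, threshold=3.0):
    """True if sample is more than `threshold` standard deviations from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two readings")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is unusual.
        return sample != mu
    return abs(sample - mu) / sigma > threshold


def devices_to_quarantine(baselines, latest, threshold=3.0):
    """Return the sorted names of devices whose latest reading is anomalous."""
    return sorted(
        device
        for device, sample in latest.items()
        if device in baselines and is_anomalous(baselines[device], sample, threshold)
    )


# Hypothetical outbound connections per minute, for illustration only.
baselines = {
    "laptop-17": [120, 130, 125, 118, 127],
    "phone-04": [40, 42, 39, 41, 40],
}
latest = {"laptop-17": 126, "phone-04": 400}

print(devices_to_quarantine(baselines, latest))  # ['phone-04']
```

In a real deployment the flagged list would feed an automated policy action, such as isolating the device, rather than a print statement.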

Observability platforms offer a real-time view of granular data across the cloud stack, not just the edge.

“This is a very exciting time, as with our AI SD-WAN and AI Assure highly trained software, our customers are positioned to take advantage of new technologies, including conversational AI, human-like avatars that can sense and speak in multiple languages, and other AIOps innovations,” Pitchon said. “Even more so, our observability and automation combined make it possible for us to manage edge workloads, optimize application performance, and generate extremely valuable data for the business on the activities underway 24/7.”

In conjunction with AI/ML-based analytics, observability platforms can analyze vast data sets to deliver valuable insights, Pitchon explained.

“Measuring everything that can affect user satisfaction is essential to ensure the smooth and safe delivery of services to customers,” Pitchon said. “We cannot and do not allow for any declines in experience, much less complete loss of performance, because we understand the impact to the bottom line when real-time communication goes down, even if only for a few minutes. When there is no room for error, nothing can replace extreme observability and not just predictive but prescriptive immediate resolution software.”

Indeed, the cost of downtime has been thoroughly documented. The cost per minute for enterprises averages $9,000, but that is just calculating obvious costs. Deeper and often immeasurable hits to the business include customer impact (damage to the company’s reputation, which can result in churn), productivity (when daily work grinds to a halt), and employee turnover, which no organization can afford today in a tough labor market where replacing an employee can easily cost $15,000 or more.
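To put the $9,000-per-minute average in perspective, the small Python sketch below scales it to outages of different lengths. It covers only the direct cost cited above and assumes the cost grows linearly with duration, which is a simplification; the function name and the outage durations are illustrative.

```python
COST_PER_MINUTE_USD = 9_000  # average direct cost per minute cited in the article


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Direct cost of an outage, assuming cost scales linearly with duration."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


# Illustrative outage lengths: one minute, five minutes, one hour.
for minutes in (1, 5, 60):
    print(f"{minutes:>2} min outage: ${downtime_cost(minutes):,}")
```

Even a five-minute interruption works out to $45,000 at that rate, and an hour to $540,000, before any of the harder-to-measure costs are counted.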

“We’ve gone so far beyond traditional KPIs because we can,” Pitchon said. “Every single aspect of the digital infrastructure we are responsible for can be seen, heard, and increasingly automated when it comes to potential trouble. We were early pioneers in leveraging AI and ML to enable proactive monitoring and workflows to identify and address potential issues before they escalate into a major outage. Doing so is key to stopping disruptions in their tracks. Our observability platform, Maestro, is the underpinning of our AI Active Assurance offering and is fully integrated with observability data flowing into single-pane-of-glass dashboards, alerting systems, and incident management processes.”