How to optimize UC monitoring services

Cristina De Luca - April 18, 2021

The demands of the post-Covid workplace have shown how important a secure and efficient Unified Communications (UC) solution is to business recovery.

Before the Covid-19 pandemic, few people regularly worked from home. At the height of the lockdowns, however, the home office became the only way to keep businesses running. And many organizations will continue with a hybrid working model for some time to come: a Gartner survey of 317 CFOs found that 74% plan to shift at least some employees to remote work permanently.

For distributed teams to collaborate effectively, communication is critical. Having to switch between multiple apps, tabs, emails, and notifications just to find the right information is a burden for employees and can undermine productivity.

In recent months, enterprises have increasingly turned to Unified Communications as a Service (UCaaS) to keep distributed teams connected and remain competitive in the digital economy. Practically overnight, UCaaS has gone from being just another interesting investment to a fundamental pillar of business success in most industries.

According to a recent report by Avant Research & Analytics, overall interest in UCaaS increased 86% immediately following the onset of the Covid-19 pandemic. This surge in demand raises a critical question for business leaders: “How do we optimize UCaaS quality-of-service monitoring and avoid negative business impact, given that resource-intensive video, VoIP, instant messaging, and email can also cause difficulties for services with higher bandwidth requirements?”

To provide a positive end-user experience, IT teams need comprehensive unified communications monitoring and analytics (UCM) solutions that deliver actionable insights across all system components and their interactions. Several things can negatively impact UC performance, including dropped packets, latency, and retransmissions.

Which tools are essential?

Maintaining high-quality unified communications and maximizing control over available resources requires robust monitoring that can operate in a multi-vendor environment.

A very common method of monitoring is to enable SNMP (Simple Network Management Protocol), although not all UC hardware has SNMP capabilities.
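For devices that do expose SNMP, even a small polling script can surface the interface error and discard counters that often hint at UC trouble. The sketch below is a minimal example assuming the classic synchronous pysnmp (4.x) hlapi, an SNMPv2c read community of “public,” and a made-up device address; swap in the OIDs and credentials relevant to your own gear.

```python
# Minimal sketch: poll an IF-MIB error counter over SNMP.
# Assumes pysnmp 4.x (synchronous hlapi) and a hypothetical device at 192.0.2.10.
from pysnmp.hlapi import (
    SnmpEngine, CommunityData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity, getCmd,
)

def poll_if_in_errors(host: str, if_index: int = 1, community: str = "public") -> int:
    """Return the ifInErrors counter for one interface, or raise on SNMP errors."""
    error_indication, error_status, _, var_binds = next(
        getCmd(
            SnmpEngine(),
            CommunityData(community, mpModel=1),            # SNMPv2c
            UdpTransportTarget((host, 161), timeout=2),
            ContextData(),
            ObjectType(ObjectIdentity("IF-MIB", "ifInErrors", if_index)),
        )
    )
    if error_indication or error_status:
        raise RuntimeError(f"SNMP poll failed: {error_indication or error_status}")
    return int(var_binds[0][1])

if __name__ == "__main__":
    print("ifInErrors:", poll_if_in_errors("192.0.2.10"))
```

In practice, a monitoring platform would poll many such counters on a schedule and alert on changes between polls rather than on absolute values.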

But for those who need to dig deeper into the performance of their unified communications infrastructure, a more sophisticated tool is required. This is where Voice/Video Quality Monitoring (VQM) systems come in, giving network administrators a comprehensive view of the health and performance of their unified communications platform.

VQM is an advanced monitoring methodology used to mitigate and resolve issues affecting the performance of UC platforms before they become catastrophic.

There are three basic categories of performance-related problems that can occur in enterprise IP telephony:

  1. IP network problems, such as instability, packet loss, and delay;
  2. Equipment configuration and signaling problems;
  3. Analog / TDM interface problems, which include things like echo (signal reflection) or signal level.

Network architects and managers need to address performance management and call quality issues during the planning and deployment phases, but should be aware that these problems also occur regularly during normal day-to-day network operations after deployment.

Many quality-related UC problems are transient and can occur anywhere along the network path. For example, a user accessing a file on a server or a home worker watching a YouTube video can cause a brief, temporary bottleneck.

This can cause short-term degradation in call quality for other users on the network. It is therefore important for network managers to use performance management tools, such as those provided by VQM, that are able to detect and measure these types of network impairments.
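To illustrate the kind of measurement involved, the sketch below implements the RTP interarrival jitter estimator from RFC 3550 (J = J + (|D| - J)/16) over a list of packet send and arrival times. The packet timings are invented for the example; real VQM tools compute this continuously from live RTP streams.

```python
# Sketch: RFC 3550 interarrival jitter estimate, J = J + (|D| - J) / 16,
# where D is the change in relative transit time between consecutive packets.
def interarrival_jitter(arrivals, timestamps):
    """arrivals/timestamps: per-packet arrival and send times, in seconds."""
    jitter = 0.0
    for i in range(1, len(arrivals)):
        transit_prev = arrivals[i - 1] - timestamps[i - 1]
        transit_curr = arrivals[i] - timestamps[i]
        d = abs(transit_curr - transit_prev)
        jitter += (d - jitter) / 16.0       # exponentially smoothed estimate
    return jitter

# Hypothetical example: packets sent every 20 ms, one arriving 15 ms late.
send = [i * 0.020 for i in range(6)]
recv = [t + 0.050 for t in send]
recv[3] += 0.015                            # a transient network hiccup
print(f"estimated jitter: {interarrival_jitter(recv, send) * 1000:.2f} ms")
```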

In addition, the transient nature of these IP problems means they are not easily detected or reproduced. They are not necessarily associated with specific cables or line cards – they can occur randomly due to “crashes” or combinations of many different factors.

Network managers can try to use packet loss and jitter metrics to estimate call quality, but this approach is reactive. It doesn’t correlate well with end-user experience and, more importantly, it doesn’t provide enough information to determine the cause of a problem.
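To make that limitation concrete, the sketch below shows roughly what such an estimate looks like: a simplified ITU-T G.107 E-model calculation that turns one-way delay and packet loss into an R-factor and then a MOS value. The impairment coefficients are illustrative defaults for a G.711-style codec with packet loss concealment, not values from any particular monitoring product.

```python
# Sketch: simplified E-model (ITU-T G.107) - convert delay and loss into a MOS estimate.
def r_factor(one_way_delay_ms: float, packet_loss_pct: float,
             ie: float = 0.0, bpl: float = 25.1) -> float:
    """Illustrative R-factor for a G.711-like codec with packet loss concealment."""
    d = one_way_delay_ms
    # Delay impairment Id (common simplified form)
    delay_impairment = 0.024 * d + (0.11 * (d - 177.3) if d > 177.3 else 0.0)
    # Effective equipment impairment Ie-eff grows with packet loss
    loss_impairment = ie + (95.0 - ie) * packet_loss_pct / (packet_loss_pct + bpl)
    return 93.2 - delay_impairment - loss_impairment

def mos_from_r(r: float) -> float:
    """Map the R-factor to a Mean Opinion Score (1.0 to 4.5)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)

# Hypothetical call: 150 ms one-way delay, 2% packet loss.
print(f"MOS ~= {mos_from_r(r_factor(150, 2.0)):.2f}")
```

Note that the resulting number says nothing about where along the path the loss or delay was introduced, which is exactly why it is of limited diagnostic value on its own.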

What to do?

Proper troubleshooting generally moves in the following order, according to Anthony Caiozzo, director of sales at Telchemy, a provider of unified communications management solutions:

  1. Check session MOS scores – these are typically reported both by session and direction. When scores are significantly degraded, examine the calls made from the location in question over a specific time period to quickly identify the common degradation factor.
  2. Next, correlate the session degradation factors to determine where to look next. The sources of degradation can be used to isolate session deficiencies and guide troubleshooting efforts (a simple sketch of steps 1 and 2 follows this list).
  3. Use the results of this second step to extract statistics by session and direction and gain a better understanding of the underlying problems perceived by the point of measurement/end-user. Some commercially available solutions provide expert per-call analysis to assist in the effective interpretation of low-level diagnostics.
  4. Compare UC sessions that have traveled similar paths during the same time period to confirm whether other sessions have experienced similar symptoms and performance.
  5. Use these results to perform corrective actions and rectify network and application performance.
  6. Continue to monitor performance data from both active and passive measurement points to verify that the identified problem has been corrected.
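As a rough illustration of steps 1 and 2, the sketch below scans per-session call records for one site, flags sessions whose MOS falls below a threshold, and tallies the most common reported degradation factor. The record fields and values are hypothetical stand-ins for whatever your VQM tool exports per session and direction.

```python
# Sketch of troubleshooting steps 1-2: flag degraded sessions for a site and
# tally their degradation factors. Record fields are hypothetical examples.
from collections import Counter
from dataclasses import dataclass

@dataclass
class SessionRecord:
    site: str
    direction: str            # "inbound" or "outbound"
    mos: float                # session MOS score
    degradation_factor: str   # e.g. "packet_loss", "jitter", "delay"

def degraded_sessions(records, site, mos_threshold=3.6):
    """Step 1: select sessions from one site whose MOS falls below the threshold."""
    return [r for r in records if r.site == site and r.mos < mos_threshold]

def common_degradation_factors(records):
    """Step 2: correlate degraded sessions by their reported degradation factor."""
    return Counter(r.degradation_factor for r in records).most_common()

records = [
    SessionRecord("branch-a", "inbound", 4.2, "none"),
    SessionRecord("branch-a", "outbound", 3.1, "jitter"),
    SessionRecord("branch-a", "inbound", 2.9, "jitter"),
    SessionRecord("branch-a", "outbound", 3.4, "packet_loss"),
]
bad = degraded_sessions(records, "branch-a")
print(common_degradation_factors(bad))   # e.g. [('jitter', 2), ('packet_loss', 1)]
```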