Network monitoring and remote working, like PB&J

Cristina De Luca

August 12, 2021

Since March 2020, the 14,000 students at the Pontifical Catholic University of Rio Grande do Sul (PUCRS) in Brazil have come to depend on online classes. The same reality has been imposed on the institution’s 4,500 teachers and administrative professionals. Besides teaching units, the campus also has nationally concentrated medical centers, such as São Lucas Hospital and the Instituto do Cérebro (InsCer). As in many other institutions around the world, in recent months users have worked in a hybrid context, in which activities are carried out partly remotely and partly on site. This configuration has further increased the demand for network monitoring services, and for monitoring of the entire digital environment responsible for local and remote access.

Committed to ensuring the availability of digital services in this universe, the PUCRS technology infrastructure team worked, at first, with a monitoring platform that required the internal development of sensors to measure the behavior of network components, equipment, and systems. This delayed the schedule, reducing the agility needed to support PUCRS processes. This challenge was overcome with the adoption of a modern monitoring solution.

In today’s world, the term network monitoring is widespread throughout the IT industry. It refers to a critical process in which all network components, such as routers, switches, firewalls, servers, and VMs, are monitored for faults and performance and continuously evaluated to maintain and optimize their availability. An important aspect of any monitoring is that it must be proactive. Finding performance issues and bottlenecks proactively helps identify problems at an early stage, and effective proactive monitoring can prevent downtime or failures.

Visualizing the network and its sensors on dashboards makes it possible to redirect equipment and resources from an area of low consumption to one that needs them. “Students, teachers, and employees want everything to work without worrying about the infrastructure behind it, so it is essential to predict and monitor the consumption of resources so that nothing is missing at the end”, says Gelson do Amaral, security and infrastructure coordinator at PUCRS.

Proactive digital resource management

Network monitoring and management are simplified and automated with the help of network monitoring software and tools. From the wide range of network management solutions available, it is important to choose a network monitoring system that can detect the bottlenecks and performance issues that have a negative impact on the network.

Inside PUCRS, thousands of sensors act on several fronts, producing a real-time view of the status of various resources. The solution chosen for monitoring was PRTG Network Monitor, which tracks the performance of 7,500 sensors, analyzes the use of IP addresses, and predicts the need for expansion or relocation of equipment, allowing the infrastructure team to act proactively.

In the case of IP address management, for example, monitoring plays a critical role. “The planning and effective use of IP addresses in each area ensured the functioning of the network that provides access to students, teachers, and staff,” Amaral says. This sensor, in particular, was developed by the supplier for the university.
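As an illustration of what analyzing IP address use can mean in practice, the following is a minimal sketch that computes how much of a subnet is in use. It is not the sensor built for PUCRS; the function name `subnet_utilization`, the example subnet, and the idea of feeding it a plain list of active addresses are all assumptions made for this example.

```python
import ipaddress


def subnet_utilization(cidr, used_addresses):
    """Return the fraction of usable addresses in `cidr` that are in use.

    `used_addresses` is any iterable of address strings (hypothetical input,
    for example the result of a DHCP lease export). Addresses outside the
    subnet are ignored, and duplicates are counted once.
    """
    network = ipaddress.ip_network(cidr)

    # For ordinary IPv4 subnets, the network and broadcast addresses
    # cannot be assigned to hosts, so they are excluded from the total.
    if network.version == 4 and network.prefixlen < 31:
        usable = network.num_addresses - 2
    else:
        usable = network.num_addresses

    in_use = {ipaddress.ip_address(addr) for addr in used_addresses}
    count = sum(1 for addr in in_use if addr in network)
    return count / usable


if __name__ == "__main__":
    # Documentation-only address ranges (RFC 5737), not real campus data.
    active = ["192.0.2.10", "192.0.2.11", "198.51.100.5"]
    ratio = subnet_utilization("192.0.2.0/24", active)
    print(f"192.0.2.0/24 utilization: {ratio:.1%}")
```

Tracking this ratio per area over time is one way to see which address ranges are close to exhaustion and which have room to spare.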

The system also tracks users’ historical consumption of the equipment. With this data, it is possible to predict the growth of the installed base early enough to allocate investments and change services without stopping operations. “The main objective is to keep everything running as long as possible, so that students and teachers do not face failures or need to ask for support”, Amaral points out.
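One simple way to turn historical consumption into a forecast is to fit a straight line to past readings and extrapolate it to the capacity limit. The sketch below does this with an ordinary least-squares fit. It is an illustration only, not how PRTG computes its forecasts; the function name `days_until_full`, the daily sampling interval, and the sample numbers are assumptions.

```python
def days_until_full(daily_usage, capacity):
    """Estimate days from the last reading until usage reaches `capacity`.

    `daily_usage` holds one reading per day, oldest first (hypothetical
    input). Returns None when usage is flat or shrinking, because a linear
    trend then never reaches the limit.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two readings to fit a trend")

    # Ordinary least-squares fit of usage = intercept + slope * day.
    x_mean = (n - 1) / 2
    y_mean = sum(daily_usage) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean

    # Day index at which the fitted line crosses the capacity limit,
    # expressed relative to the most recent reading (index n - 1).
    day_full = (capacity - intercept) / slope
    return max(0.0, day_full - (n - 1))


if __name__ == "__main__":
    # Illustrative storage usage in GB, growing by 2 GB per day.
    usage = [50, 52, 54, 56, 58]
    print(days_until_full(usage, capacity=100))  # prints 21.0
```

A linear fit is the crudest possible model; seasonal workloads such as academic terms would call for something that accounts for recurring peaks.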

Besides 350 servers, the PUCRS infrastructure team is responsible for more than 900 access points, 350 switches, 20 databases, and two data centers. In the data centers, the monitoring covers, for example, CO2 levels, temperature, humidity, air conditioning, and door closure. The flexibility of PRTG allows a single platform to monitor even the student entry and exit turnstiles on campus, as well as remote teaching applications.

During the coronavirus pandemic, which required the offering of distance learning classes, user demand did not increase. Still, the concern to keep everything up and running without failures is constant. “There is a chain of mutually dependent configurations that are closely monitored, avoiding compromising the network,” explains Catiuscia.

To achieve results like these, PUCRS conducted PoCs with several platforms on the market and chose Paessler’s product because it required less internal development while delivering monitoring capabilities suited to the university’s needs. “The tool surpassed other monitoring systems for being more complete, easy to manage, and affordable,” says Catiuscia.

Main benefits of network monitoring

  • Clear visibility of the network

Through network monitoring, administrators can get a clear picture of all the connected devices on the network, see how data is moving between them, and quickly identify and correct problems that can hamper performance and lead to outages.

  • Better use of IT resources

The hardware and software tools in network monitoring systems reduce manual work for IT teams. This means that valuable IT staff have more time to devote to projects that are essential to the organization.

  • An initial overview of future infrastructure needs

Network monitoring systems can provide reports on the performance of network components over a defined period. By analyzing these reports, network administrators can anticipate when the organization may need to consider upgrading or implementing a new IT infrastructure.

  • The ability to identify security threats more quickly

Network monitoring helps organizations understand what the “normal” performance of their networks looks like. So when unusual activity occurs, such as an unexplained increase in network traffic levels, it is easier for administrators to identify the problem quickly – and determine if it could be a security threat.
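The idea of comparing current activity against a "normal" baseline can be sketched with a basic statistical check: compute the mean and standard deviation of recent readings and flag a new reading that lies too far from them. This is a minimal sketch of one common approach, not a description of any particular product; the function name `is_unusual`, the threshold of three standard deviations, and the sample traffic values are assumptions.

```python
from statistics import mean, stdev


def is_unusual(history, latest, z_limit=3.0):
    """Return True if `latest` deviates strongly from the `history` baseline.

    `history` is a list of recent readings taken under normal conditions
    (hypothetical input, for example traffic in Mbit/s). A reading counts
    as unusual when it is more than `z_limit` standard deviations from
    the historical mean.
    """
    if len(history) < 2:
        raise ValueError("need at least two readings to build a baseline")

    baseline = mean(history)
    spread = stdev(history)

    # A perfectly constant history has no spread, so any change stands out.
    if spread == 0:
        return latest != baseline

    return abs(latest - baseline) / spread > z_limit


if __name__ == "__main__":
    # Illustrative traffic samples in Mbit/s, not measured data.
    normal_traffic = [100, 104, 98, 101, 97, 102, 99, 103]
    print(is_unusual(normal_traffic, 103))  # within the usual range: False
    print(is_unusual(normal_traffic, 180))  # unexplained spike: True
```

A flagged reading is only a prompt to investigate. Whether the spike is a security threat, a backup job, or the start of term still has to be decided by an administrator.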

On large networks, such as the one at PUCRS, it is not feasible to simply have thousands (or even tens of thousands) of sensors across the network all sending data back to a single central monitoring server. Instead, you will need to logically segment your infrastructure. So before you plan your monitoring architecture, remember that you need to understand your environment.

For whatever you want to monitor, there will be multiple measurement points. If you want to monitor the devices themselves, you will need to track things like device temperature, fan speed, remaining storage, CPU load, or other metrics that may be relevant.

Naturally, the more measurement points you have, the more processing power and planning your monitoring concept requires.

Similarly, for each measurement point to be meaningful, you need to set limits. You therefore need to know not only what you want to measure, but also the accepted operating range for each component you are monitoring.
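Defining an accepted operating range for each measurement point can be sketched as a small lookup of warning and error limits per metric. The example below is a minimal illustration under assumed names and values: the `OperatingRange` class, the metric names, and every limit are hypothetical and would need to come from your own environment and vendor specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingRange:
    """Warning and error limits for one measurement point (hypothetical)."""
    warn_low: float
    warn_high: float
    error_low: float
    error_high: float

    def classify(self, value):
        """Map a reading to 'ok', 'warning', or 'error'."""
        if value < self.error_low or value > self.error_high:
            return "error"
        if value < self.warn_low or value > self.warn_high:
            return "warning"
        return "ok"


# Illustrative limits only; real values depend on the equipment.
RANGES = {
    "datacenter_temp_c": OperatingRange(18.0, 27.0, 15.0, 32.0),
    "disk_free_pct": OperatingRange(20.0, 100.0, 10.0, 100.0),
}


def evaluate(readings):
    """Classify each reading against the range defined for its metric."""
    return {name: RANGES[name].classify(value) for name, value in readings.items()}


if __name__ == "__main__":
    status = evaluate({"datacenter_temp_c": 29.5, "disk_free_pct": 8.0})
    print(status)  # {'datacenter_temp_c': 'warning', 'disk_free_pct': 'error'}
```

Separating warning limits from error limits gives the team time to react before a component actually leaves its safe range.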