Integrated monitoring is the most important AIOps feature

Cristina De Luca - July 31, 2023

The complexity of IT systems and the continued exponential growth of monitoring data have driven the adoption of AIOps as a way to improve business insights. Gartner defines AIOps as the use of Big Data and Machine Learning to automate IT operations processes such as event correlation, anomaly detection, performance analysis and workflow automation.

By Research and Markets’ reckoning alone, we’re talking about a global market that is expected to grow from $5.82bn in 2022 to $7.4bn in 2023 at a compound annual growth rate (CAGR) of 27.1%. And the market is projected to reach an estimated $19.79bn by 2027.
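As a quick sanity check on these figures, the growth rates can be recomputed from the quoted market sizes. The sketch below is illustrative only: the `cagr` helper is a hypothetical name, and the only inputs are the three dollar values cited above.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate, returned as a fraction (0.27 means 27%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes quoted in the text, in billions of US dollars.
growth_2022_2023 = cagr(5.82, 7.4, years=1)
growth_2023_2027 = cagr(7.4, 19.79, years=4)

print(f"2022 -> 2023: {growth_2022_2023:.1%}")
print(f"2023 -> 2027: {growth_2023_2027:.1%}")
```

The one-year step from 2022 to 2023 reproduces the 27.1% rate quoted above, and the four-year projection to 2027 implies a similar annual rate of a little under 28%.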

Looking ahead, the future of IT seems closely aligned with AIOps and its ability to reduce operational costs and downtime while raising service quality, which in turn increases satisfaction among internal and external customers.

How to start implementing AIOps

There is no one-size-fits-all recipe for implementing AIOps, but there are some basic aspects to consider. IBM, for example, works with three: observability, predictive analytics and proactive response.

Observability is related to the collection, aggregation and analysis of operational data, generated by monitoring. It is considered the foundation for the development of AIOps. Integrated monitoring was ranked as the most important feature of an AIOps solution, cited by nearly 55% of respondents in a recent OpsRamp study. Monitoring tools are still the ideal approach to detecting performance issues or IT outages. And in this particular instance, the study found that companies are doing a better job at controlling tool proliferation, with more than half of respondents (56.5%) having fewer than 10 monitoring tools. Yet a third of companies are using 10-19 tools and another 10% have more than 20!

Domain-centric IT infrastructure monitoring tools are still the go-to approach for detecting performance issues or IT outages.
Source: OpsRamp

AI solutions are then layered on top of the monitoring data to analyze and correlate it for a more accurate understanding of what is happening. This is the basic idea behind the AIOps concept.

Predictive analytics enables IT professionals to maintain control over complex processes and ensure the performance of IT operations. Examples of predictive analytics practices are anomaly detection, alerts, recommendations, and optimization of IT performance.

From there, AI solutions are used to analyze and correlate data to proactively respond to unexpected events, in particular outages or slowdowns. The goal is to maintain the performance of the IT infrastructure while planning, scheduling and allocating resources. Predictive algorithms can recognize patterns and trends in performance metrics, and thus predict and prevent problems before they arise.
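To make the idea of pattern recognition in performance metrics concrete, here is a minimal sketch of one simple anomaly detection technique, a trailing-window z-score. It is not how any particular AIOps product works. The function name, window size, threshold and sample latency values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` samples by more than `threshold` sample standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response-time samples in milliseconds, with one spike.
latency_ms = [102, 98, 101, 99, 100, 103, 97, 250, 101, 99]
print(find_anomalies(latency_ms))  # -> [7]
```

A production system would typically account for seasonality and trends as well. The point here is only that a metric is compared against its own recent behaviour rather than against a fixed threshold.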

By integrating multiple disparate manual IT operations tools into a single, intelligent and automated IT operations platform, AIOps enables IT operations teams to respond quickly and proactively to slowdowns and disruptions with far less effort.

According to the OpsRamp study, conducted in December 2022, more than 60% of the 265 C-level executives surveyed were already adopting AIOps to improve the availability and performance of services and applications. The second and third top choices were for operations automation (58%) and processes (54%).

The biggest IT operations challenge in 2023 is automating as many operations as possible, cited by 66% of respondents. However, only half of respondents (52%) cited automation of tedious tasks as their top operational benefit of AIOps, ranking behind reducing open incident tickets (65%) and reducing MTTD and MTTR (56%).

Improvements in automation are clearly the main concern for businesses in 2023.

Dependency mapping needs to be handled at the monitoring stage before you can adopt AIOps.
Source: OpsRamp

Benefits

The main benefit of AIOps is that it enables IT teams to identify, address and resolve slowdowns and outages faster than would be possible by manually triaging alerts from different IT operations tools. This results in:

  • Faster mean time to resolution (MTTR): By separating meaningful alerts from IT operations noise and correlating operations data from multiple IT environments, AIOps can identify root causes and propose solutions faster and more accurately. This enables organisations to set and achieve previously unthinkable MTTR targets.

  • Reduced operational costs: Automatic identification of operational issues and rescheduled response routings reduce operational costs, allowing resources to be allocated more appropriately. This also frees up staff resources to work on more innovative and complex activities, resulting in improved employee experience.

  • More observability and better collaboration: The integration available in AIOps monitoring tools enables more effective collaboration between DevOps teams, ITOps, management and security functions. The improved visibility, communication and transparency enable these teams to make better decisions and respond to issues quickly.

  • Move from reactive to proactive to predictive management: With built-in predictive data analytics capabilities, AIOps continuously learns to identify and prioritise the most urgent alerts, enabling IT teams to manage potential issues before they result in delays and downtime.
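The alert correlation behind these benefits can be illustrated with a deliberately simple sketch: alerts from different tools that concern the same service and arrive close together are folded into one incident. Real platforms rely on far richer signals, such as topology and learned patterns. The `Alert` fields, tool names, service names and the 120-second gap are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since an arbitrary reference point
    source: str       # hypothetical monitoring tool that raised the alert
    service: str      # hypothetical service the alert refers to


def correlate(alerts, max_gap_seconds=120.0):
    """Group alerts for the same service into one incident when each alert
    arrives within `max_gap_seconds` of the previous one for that service."""
    incidents = []
    latest = {}  # service -> most recent incident (a list of alerts)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest.get(alert.service)
        if current and alert.timestamp - current[-1].timestamp <= max_gap_seconds:
            current.append(alert)
        else:
            current = [alert]
            incidents.append(current)
            latest[alert.service] = current
    return incidents


alerts = [
    Alert(0, "network-monitor", "checkout"),
    Alert(30, "apm", "checkout"),
    Alert(45, "log-watcher", "checkout"),
    Alert(50, "apm", "search"),
    Alert(900, "apm", "checkout"),
]
incidents = correlate(alerts)
print(len(alerts), "alerts ->", len(incidents), "incidents")
```

In this toy data set, five alerts from three tools collapse into three incidents, which is the kind of noise reduction that shortens manual triage.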

For all these reasons, the AIOps approach bridges the gap between an increasingly complex, dynamic and difficult-to-monitor IT landscape and user expectations for little or no disruption to application performance and availability.

What to worry about when choosing a supplier

Most experts consider AIOps to be the future of IT operations management, and demand is growing steadily with increased enterprise focus on digital transformation initiatives. With the global economy facing headwinds on several fronts, including inflation, the cost-of-living crisis, higher interest rates and the war in Ukraine, many businesses are focussed on improving IT efficiency and automation in 2023.

These organisations will need help (from service providers) to keep their IT infrastructure running smoothly. What should AIOps custo

  • Be efficient – Organisations seeking deep real-time insights, with auto-correct capabilities across technology stacks ranging from mainframe to mobile, cannot afford deficiencies in any of the various AIOps capabilities. Significant deficiencies in even a few features can limit an AIOps solution and require expensive integrations that need regular maintenance. Strong native features allow for reduced technology debt, while an extensive library of out-of-the-box (OOTB) third-party connectors will reduce bespoke integration costs.

  • Provide transparent and reliable automated remediation capabilities – Automated remediation is largely limited by lack of trust from corporate buyers and doubts about transparency in vendor solutions. Companies are successfully using solutions to automate simple, repetitive operational tasks and corrections, but are slowly testing more complex actions for reliability. A deeper understanding of how AI/ML engines leverage sensory and telemetry data will help increase adoption of these capabilities. For this to happen, solutions must work even harder to dispel fears by exposing how they make automation decisions and what they base them on, gaining the confidence to execute autonomously and enabling pathways for adoption.

  • Provide insights from the beginning to the end of a transaction – It is no longer sufficient to map services based on technology infrastructure alone. To truly enable improvements in user experience, reduce mean time to resolution (MTTR) and mean time to identification (MTTI), and ultimately enable the business to grow, efforts must start with a focus on user experience. Service degradation or outages should be identified from the users’ perspective, not just the infrastructure’s. Business context and cycles, usage patterns and historical trends are vital data elements for proactively addressing potential negative impacts on user experience. Solutions need to act on this in real time and provide these elements to architecture, development and design teams for core system enhancements.

Other key findings of the OpsRamp study

  • Application-to-infrastructure dependency mapping is the top incident management challenge for enterprises, cited by 64% of total respondents.

  • Smart alerting is the main use case for AIOps today, adopted by 70% of companies.

  • The vast majority of AIOps implementations – more than 80% – take six months or less.

  • Data accuracy was respondents’ biggest concern about AIOps, cited by 62% of companies.

  • AIOps is creating jobs, not killing them, although engineers with the right skills for AIOps remain hard to find. Only 36% of respondents were concerned about AIOps deployment causing job losses, while 68% said it takes more than six months to hire engineers with the right skills for AIOps.
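The first finding in this list, application-to-infrastructure dependency mapping, comes down to answering one question during an incident: if this component fails, which applications are affected? A minimal sketch of that lookup follows, assuming a small hand-written dependency map. Every component name is hypothetical, and real tools discover these relationships automatically.

```python
from collections import deque

# Hypothetical map: each component lists what it directly depends on.
DEPENDS_ON = {
    "checkout-app": ["payments-api", "web-server-1"],
    "search-app": ["search-index", "web-server-1"],
    "payments-api": ["db-primary"],
    "search-index": ["storage-array"],
    "web-server-1": [],
    "db-primary": ["storage-array"],
    "storage-array": [],
}


def impacted_by(failed, depends_on):
    """Return every component that depends, directly or transitively,
    on the `failed` component."""
    # Invert the map: component -> components that depend on it.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk upwards from the failed component.
    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


print(impacted_by("storage-array", DEPENDS_ON))
```

With this map, a storage failure surfaces both applications plus the intermediate database, API and index, while a failure of `web-server-1` surfaces only the two applications.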

“The study shows that AIOps is real and offers tangible benefits to businesses and MSPs,” said Suresh Vobbilesetty, executive vice president of engineering at OpsRamp. “But it also shows that organisations’ AIOps initiatives remain a work in progress and have a long way to go before they can realise the technology’s full potential,” he added.