Successfully Implementing AIOps
What AIOps Does for You
AI, a modern and advanced form of data processing, does what traditional data processing cannot do. It mines data for insights and turns them into actionable situational intelligence that turbocharges human actions and automation. AI should anticipate your needs and provide relevant, actionable guidance and results. It should focus you on what is most important at any given moment and provide explicit actions to take to solve an actual problem or a problem that is likely to occur. AI does indeed do these things; it augments your skills and abilities by providing intelligent assistance and automating tasks. It improves your efficiency and efficacy, and hence your work-life balance too.
Applying this to IT operations, AI anticipates your need for visibility into your infrastructure and provides it. AI focuses you on the problem that matters most and generates a single priority alert rather than overwhelming you with many alerts. Consolidating alerts from multiple issues related to the same problem greatly reduces alert noise and fatigue, and it increases your efficiency.
Included with alerts, when possible, are specific corrective actions to take that will resolve the problem. When possible, the corrective actions are automatically executed. Whether intelligently assisted or intelligently automated, AI for IT operations (AIOps) drives IT operations to be as efficient and effective as possible.
AIOps Turbocharges Human Actions and Automation
In addition to making you more agile, efficient, and effective, and less stressed, AIOps also enables levels of automation that would not otherwise be possible. Automation lessens your workload, further increasing your overall efficiency. It reduces troubleshooting time and the corresponding mean time to resolution (MTTR), often to zero, by automatically identifying the root cause.
Extending intelligent assistance and intelligent automation to your colleagues makes the entire IT operations team highly productive, agile, efficient, and accurate, leaving more time to add value to your organization and enjoy your time away from work. Corrective actions, called prescriptive recommendations, are matched to specific problems. They substitute best practices for guesswork, which reduces operational and business risk. Less guesswork also lowers stress because you don’t have to wonder and worry whether you created a “ticking time bomb.”
With this appreciation for what AIOps can do, let’s explore how to successfully implement AIOps for your organization, starting with the pitfalls that can hinder your success.
IT Program Success
For decades, reports have been published revealing that the percentage of IT programs that fail is between 60% and 80%. According to McKinsey & Company, there are a few primary reasons that IT programs fail. One reason, “pilot stall,” occurs when the value derived from a program is so limited that there is no convincing business case to continue. Oftentimes the lack of value arises because a program’s initial “phase-1” objectives are too aggressive. So, establishing and clearly communicating realistic and achievable objectives that demonstrate business value is critical to success.
AIOps is an aggressive program and therefore faces those same failure risks. It must start with a clear vision and direction, just like any other IT initiative. Technology risks include data management; big and fast data must be harnessed and piped to a data lake and/or an AIOps platform. If it is an organization’s first foray into harnessing data, it’s a good strategy to limit scope, start small, and not try to integrate data from multiple silos at first. Infrastructure issues regarding on-premises, cloud, hybrid, distributed, and multi-tenant environments will arise, so keeping things simple is advised to avert as many risks as possible that could impede a successful rollout that provides real business value.
Common Pitfalls and Critical Success Factors for AIOps Initiatives
Below are the most common reasons for delays and lack of success.
- Not having clearly defined goals, or so-called SMART goals (i.e., each goal must be specific, measurable, actionable/achievable, relevant, and time-bound). Despite the name Artificial Intelligence, omnipotence is not possible. What is possible is solving specifically defined and scoped problems with analytics and machine learning applied to big and fast data.
- Scope and scale. Less scope and scale invariably reduce risk, so it is often best to start with a small scope and obtain results early (or fail early and adjust the plan). A previous blog on AIOps (“Network Data Is Essential for AIOps”) recommends a “crawl-walk-run” plan that uses only network data at the outset and adds further sources in subsequent phases.
- Data. There are several pitfalls when it comes to data. The same blog mentioned in the preceding item also advocates starting with a specific data set, such as network data (especially because of its richness), to further limit scope and scale risks. All of the following must be addressed before an AIOps solution can be implemented and provide value:
- Not having the data (or data features) needed to solve the problem
- Not having access to the data (typically because it is locked in silos)
- Not having the volume of data needed to train machine learning models
- Not having data of high enough quality, or lacking the ability to clean and normalize the data prior to ingestion and processing by the AIOps-specific analytics and models
- Data governance and provenance
- Organizational culture and acceptance of the solution. Some programs fail because of resistance to using a solution. Including the right stakeholders from the outset of the process garners early buy-in that increases acceptance and use of a solution.
How AIOps Augments, Assists, Automates
AIOps applies advanced analytics, machine learning (ML), and big data processing methods to IT operational problems and workflows. Diagnostic and descriptive analytics give you actionable situational intelligence about what happened and what is happening, complemented with context that includes when, where, and why. Predictive analytics identifies likely future problems and their impact so they can be proactively averted. Once the situation is understood, prescriptive analytics comes into play by providing recommended actions. Correlating data from multiple sources fortifies the insights, context, and intelligence, and it consolidates multiple alerts from the symptoms of a problem into just one alert for the problem itself. Consolidating alerts and directing them only to the most appropriate recipients truly turbocharges efficiency because too many alerts can be counterproductive, even paralyzing. So, AIOps does what you would do: troubleshoot to understand the problem, notify the most appropriate team or people to solve the problem, and take the most appropriate corrective action.
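To make the idea of consolidating symptom alerts into one problem alert more concrete, here is a minimal Python sketch. It is an illustration under simple assumptions, not how any particular AIOps product works: the `Alert` fields, the `consolidate` function, and the five-minute correlation window are all hypothetical, and real solutions typically correlate on topology and learned patterns rather than on resource name and time alone.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Alert:
    # All field names are hypothetical, chosen only for this illustration.
    source: str       # monitoring tool that raised the alert
    resource: str     # affected device or service
    timestamp: float  # seconds since some reference time
    message: str


def _summarize(resource, group):
    """Collapse a group of related symptom alerts into one incident record."""
    return {
        "resource": resource,
        "first_seen": group[0].timestamp,
        "last_seen": group[-1].timestamp,
        "symptom_count": len(group),
        "sources": sorted({a.source for a in group}),
    }


def consolidate(alerts, window_seconds=300.0):
    """Group alerts on the same resource that arrive within
    window_seconds of the previous alert for that resource."""
    by_resource = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        by_resource[alert.resource].append(alert)

    incidents = []
    for resource, items in by_resource.items():
        current = [items[0]]
        for alert in items[1:]:
            if alert.timestamp - current[-1].timestamp <= window_seconds:
                current.append(alert)  # same burst of symptoms
            else:
                incidents.append(_summarize(resource, current))
                current = [alert]      # start a new incident
        incidents.append(_summarize(resource, current))
    return incidents
```

With this sketch, two alerts raised one minute apart by different tools for the same switch become a single incident that lists both sources, while an alert for the same switch much later, or for a different resource, is reported separately.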
When incidents and problems occur, MTTR is reduced because the AIOps solution recommends or automates the corrective actions without interference from alert noise. More than just a time-saver, AIOps also increases the use of best practices versus guesswork, thereby improving effectiveness and lowering operational and security risks to help ensure maximal business continuity.
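Because MTTR is the metric used here to express the benefit, it helps to be explicit about how it can be computed. The Python sketch below assumes MTTR is the average of the time from detection to resolution over a set of resolved incidents; the function name and the input shape (pairs of detection and resolution times) are assumptions made for this example, and organizations define the start and end points differently.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average resolution time over resolved incidents.

    incidents: iterable of (detected_at, resolved_at) datetime pairs.
    This input shape is an assumption for the illustration.
    """
    durations = [resolved - detected for detected, resolved in incidents]
    if not durations:
        raise ValueError("need at least one resolved incident")
    # Summing timedeltas requires an explicit timedelta start value.
    return sum(durations, timedelta()) / len(durations)
```

Tracking this value before and after an AIOps rollout is one way to give a “phase-1” objective a measurable target.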
AIOps Requires Reliable Data
As noted earlier in the section about pitfalls and critical success factors, data is the fuel for training models and for real-time results and desired outcomes. Desired outcomes depend on reliable, high-quality data that is consistent, complete (i.e., without gaps), and accurate. Conversely, garbage-in, garbage-out; and nothing-in, nothing-out.
AIOps uses machine learning, advanced analytics, and other big data processing methods such as correlation to develop insights, intelligence, and an understanding of situations, hence the frequent use of the phrase “situational intelligence.” When a situation is understood, it’s possible to consolidate and prioritize alerts, automate the resolution if possible, or otherwise drive an efficient and effective fix by providing the root cause and corrective actions.
Network packet data is rich and therefore necessary for a successful AIOps initiative. Obtaining quality network data is more straightforward than acquiring other forms of data or acquiring data from other sources. The analytics and models need to be tuned to analyze network data, especially if it is in packet format. Preprocessing by a network packet broker such as cVu® offloads some of the work from the analytics and models by delivering only the data needed.
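As a rough illustration of what checking for complete and accurate data can look like before ingestion, the Python sketch below scans a series of timestamped samples for gaps, missing values, and values outside a plausible range. The function name, the 1.5x gap tolerance, and the assumption that valid values lie between 0 and `max_value` are all choices made for this example; real pipelines would apply checks suited to each data source.

```python
def check_samples(samples, expected_interval, max_value):
    """Report basic quality issues in a time series.

    samples: list of (timestamp_seconds, value) tuples sorted by
             timestamp; value may be None when a reading is missing.
    expected_interval: nominal seconds between consecutive samples.
    max_value: upper bound for a plausible reading (lower bound is 0).
    """
    # A gap is any spacing noticeably larger than the nominal interval.
    gaps = [
        (t_prev, t_next)
        for (t_prev, _), (t_next, _) in zip(samples, samples[1:])
        if t_next - t_prev > expected_interval * 1.5
    ]
    missing = [t for t, v in samples if v is None]
    out_of_range = [
        t for t, v in samples
        if v is not None and not (0 <= v <= max_value)
    ]
    return {"gaps": gaps, "missing": missing, "out_of_range": out_of_range}
```

For example, with samples expected every 60 seconds and a maximum plausible value of 100, a series with a three-minute hole, one empty reading, and one reading of 150 would be flagged on all three counts.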
Industry Standard Network Monitoring and Visibility for AIOps
Using solutions, such as those provided by cPacket Networks, that capture, store, and make real-time and historical data available is critical to driving AIOps solutions. Data is “tapped,” or acquired, from strategic locations within physical and virtual networks using appliances such as multi-port/multi-speed TAPs, or virtual TAPs in virtualized and cloud environments. The data is centralized into a monitoring plane using network packet brokering that flexibly delivers the right real-time data to an AIOps solution. Capture-to-storage devices can store data at wire speed; the AIOps solution or an intermediate extract, transform, load (ETL) data pipeline can then poll that data for use cases that operate on historical data.
Conclusion
AIOps provides value to the IT operations team and to the organization. However, implementing AIOps is a project that is large in scope and scale. Both must be managed and ideally implemented in a stepwise manner to ensure short-term and long-term success of an AIOps initiative. Since data is the fuel for AIOps, addressing all aspects of data at the outset of an initiative such as this is critical.
Using network packet data is a good starting point for managing scope and scale data issues, as well as overall project risk. Many programs that change work processes and workflows fail because of a lack of acceptance. Including and communicating with as many stakeholders as possible, including end users, is a critical success factor that goes beyond making the technology work.
Author
Nadeem Zahid
Chief Marketing – Mach 01

