It is generally believed that after doing something once, it’s easier to do it again. But the opposite often applies in the world of network operations, where doing something once can make it harder to do it the next time or, more importantly, the next thousand times. Once a technology is introduced into the network, the operations teams must consider not just what the new technology does but also how it does it and, more to the point, how people will interact with that technology to bring services to customers.
Open Standards Enable Operations Teams to Deploy AI/ML Successfully
To deploy at scale, new technologies often need new installation methods, new monitoring systems, and new support processes; and all these require new tools and new training. This is because operations teams need to clearly understand how a technology will behave in the real world in order to successfully use it to deliver the intended business outcomes.
This is why we have standards for nearly every portion of the telecommunications network. There are IEEE standards that define how Ethernet works; ISO standards for software engineering; CableLabs standards for components of the cable telecom network; and SCTE standards for operating and maintaining the cable network. These standards not only provide assurance of interoperability, but also allow operations teams to clearly understand how the systems work so that they can build processes that ensure repeatable and consistent delivery of customer services.
How Network Operations AI/ML Can Speed MSO Fault Resolution
Artificial Intelligence and Machine Learning (AI/ML) is a relatively new discipline for telecommunications networks. Software advancements are beginning to enable operations teams to analyze massive amounts of data and quickly arrive at a conclusion. For example, a typical Tier 1 MSO is gathering several million network events at any given moment. Operations staff traditionally had to sort through these events, filter out the noise, and then work through the data until they came to a conclusion. This meant it often took hours to troubleshoot network outages and identify points of failure before a fix agent could correct the problem. This is because humans are generally bad at looking at large amounts of data and quickly filtering out what’s irrelevant.
Historically, we have tried to create software filters to lessen the amount of extraneous information, but anyone who has used inbox filters in an email client knows that the chances of missing an important message are pretty high even if you configure the filter correctly. The same holds true with network monitoring software. Static filters can quickly create a scenario where you miss something vital.
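To make the static-filter problem concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the Event fields, the severity names, the device names, and the sample messages are hypothetical and do not come from any particular monitoring product. It shows a fixed severity cutoff quietly discarding a repeated low-severity event that might be an early warning.

```python
from dataclasses import dataclass


@dataclass
class Event:
    device: str
    severity: str  # hypothetical levels: "info", "minor", "major", "critical"
    message: str


# Hypothetical ordering of severity levels, lowest to highest.
SEVERITY_ORDER = {"info": 0, "minor": 1, "major": 2, "critical": 3}


def static_filter(events, min_severity="major"):
    """Keep only events at or above a fixed severity level."""
    threshold = SEVERITY_ORDER[min_severity]
    return [e for e in events if SEVERITY_ORDER[e.severity] >= threshold]


# Made-up sample events: one node keeps reporting the same minor problem.
events = [
    Event("cmts-01", "info", "config saved"),
    Event("node-17", "minor", "upstream SNR degraded"),
    Event("node-17", "minor", "upstream SNR degraded"),
    Event("node-17", "minor", "upstream SNR degraded"),
    Event("cmts-02", "major", "fan failure"),
]

kept = static_filter(events)
dropped = [e for e in events if e not in kept]

# The filter does what it was configured to do, yet the repeated
# node-17 warnings never reach the operator.
print("kept:", [(e.device, e.message) for e in kept])
print("dropped:", [(e.device, e.message) for e in dropped])
```

The filter is configured correctly by its own rules, but the pattern that matters (the same node degrading three times in a row) sits entirely below the cutoff.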
So how do we help humans whose brains are not designed to find needles in haystacks? Turns out computers are actually quite good at doing that job and, with recent advances in machine learning, software can be quite adept at sorting through large amounts of data and arriving at a conclusion.
Open Standards Help Build Transparent, Trusted AI/ML for Network Operations
AI/ML is, however, still a relatively undefined discipline for telecommunications networks. Most of the software solutions are “black boxes” providing little information and transparency about how they arrive at their conclusions. This lack of transparency immediately creates a lack of trust in the minds of operations teams who are accountable for the health of the networks they maintain. To become accepted in operations, AI/ML needs to be transparent, explainable, and trusted. Imagine if your doctor used an AI/ML algorithm to diagnose your affliction and suggest a course of treatment. You and your doctor would want to know how that system arrived at its conclusion before moving forward. Similarly, an ideal AI/ML engine used in network operations would offer not only a recommendation, but also an explanation as to how it reached that conclusion, so that the operations humans have a basis for trusting the system.
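As a rough illustration of what "a recommendation plus an explanation" could look like, the Python sketch below flags a reading that deviates from a device's recent baseline and returns the reasoning alongside the verdict. This is a deliberately simple z-score check, not a description of any vendor's engine or of anything the SCTE has specified; the function name, metric name, threshold, and sample values are all hypothetical.

```python
from statistics import mean, stdev


def detect_anomaly(device, metric, history, latest, z_threshold=3.0):
    """Flag `latest` if it is far from the baseline, and say why.

    Assumption: a plain z-score against recent history stands in for
    whatever model a real system would use.
    """
    baseline = mean(history)
    spread = stdev(history)
    z = (latest - baseline) / spread if spread > 0 else 0.0
    is_anomaly = abs(z) >= z_threshold

    explanation = (
        f"{metric} on {device} is {latest}; the baseline over the last "
        f"{len(history)} readings is {baseline:.2f} with a standard "
        f"deviation of {spread:.2f}, so the reading is {abs(z):.1f} "
        f"standard deviations away (threshold {z_threshold})."
    )
    recommendation = (
        "Investigate this device" if is_anomaly else "No action needed"
    )
    return {
        "device": device,
        "anomaly": is_anomaly,
        "z_score": z,
        "recommendation": recommendation,
        "explanation": explanation,
    }


# Hypothetical upstream SNR readings (dB) for one cable modem.
history = [36.0, 35.5, 36.2, 35.8, 36.1, 35.9]
result = detect_anomaly("modem-4412", "upstream_snr_db", history, 29.0)
print(result["recommendation"])
print(result["explanation"])
```

The point of the design is that the operator sees the baseline, the deviation, and the threshold that triggered the recommendation, rather than only a verdict.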
Additionally, since the AI/ML discipline is relatively new, it is not yet deeply understood, and implementations do not have a long history of deployment. This means that AI/ML might be implemented with the best intentions but produce unexpected results. In the documentary “The Social Dilemma,” several designers of the algorithms that drive social media and search engines discuss how their AI/ML produced unexpected results. Intended to drive human engagement and participation, the algorithms also ended up driving the spread of misinformation and even began to reshape how people thought and behaved. The designers likely never intended to promulgate false information; that was an unexpected side effect of an algorithm built to drive engagement. Unexpected results are exactly what network operations teams work hard to avoid. An AI/ML system that is not transparent or understandable increases the probability of unexpected results and decreases trust in the system, and untrusted systems do not get adopted by operations, at least not willingly.
Working with AI/ML Open Standards Teams in MSO Industry Organizations
As stated previously, open standards help create consistency and trust in the operations of a system. This is why, in May 2020, the SCTE formed an AI/ML standards team to begin defining use cases, best practices, and eventually standards for using AI/ML in the cable network. The standards team is divided into four working groups, each focused on a specific area of AI/ML use cases for cable/MSO networks. These are Data Governance, Anomaly Detection, Digital Piracy, and Automated Ad Analysis.
Fujitsu has been leading and contributing to the Anomaly Detection working group since August 2020. We are working closely with cable operators and other industry vendors to identify AI/ML use cases for detecting anomalous behaviors in the cable network. We are also working to document how AI/ML can resolve these use cases in a clear, consistent and repeatable manner that empowers operations teams to deliver better business outcomes. Example use cases for anomaly detection include LTE call setup anomalies; set-top box and cable modem anomalies; and optical network anomalies. We can apply AI/ML to analyze large amounts of data not just to correlate faults but also to predict where failures may soon occur in the network, allowing the operator to take preemptive action. Few things will drive better customer experience scores and reduce incoming service calls and truck rolls like identifying a network issue and preemptively initiating a repair. Cox Communications presented a paper at Cable-Tec Expo 2019 that described using AI/ML to do just that in their DOCSIS network in a few test markets. The resulting impact to inbound calls, truck rolls and customer satisfaction was so immediate and clear that they stated they would immediately roll this out to all markets.
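The idea of predicting a failure before it happens can be sketched with a simple trend extrapolation. The Python example below fits a straight line to recent readings of a degrading metric and estimates how many more readings remain before it crosses a failure threshold. It is only an illustration under stated assumptions: it is not the method used in the Cox paper or by the working group, and the function name, the optical receive power values, and the threshold are hypothetical.

```python
def readings_until_threshold(readings, threshold):
    """Estimate how many future readings remain before a falling
    metric crosses `threshold`, using an ordinary least-squares line.

    Assumptions: readings are evenly spaced in time and lower values
    are worse. Returns None if the metric is not trending downward.
    """
    n = len(readings)
    if n < 2:
        return None

    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(readings) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, readings))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    if slope >= 0:
        return None  # flat or improving: no predicted crossing

    crossing_index = (threshold - intercept) / slope
    remaining = crossing_index - (n - 1)
    return max(remaining, 0.0)


# Hypothetical optical receive power (dBm), one reading per day.
power_dbm = [-10.0, -10.5, -11.0, -11.5, -12.0]
days_left = readings_until_threshold(power_dbm, threshold=-14.0)
print("estimated readings until threshold:", days_left)
```

A production system would use far richer models and data, but even this sketch shows why prediction supports preemptive repair: the operator gets an estimated lead time instead of an alarm after the fact.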
Building Understanding and Trust in AI/ML and Network Automation
It is clear that AI/ML has great potential for improving MSO network operations overall, but for it to be adopted, it must be well understood and trusted. This is the primary goal of the Anomaly Detection working group. This group seeks to identify cable-centric use cases for AI/ML, as well as to document best practices for applying AI/ML to those use cases. We need AI/ML that is transparent, trusted, and explainable in order for it to be successful and avoid unexpected outcomes.
By developing solid best practices and, eventually, standards, the SCTE AI/ML working group will contribute greatly to this effort. If you are interested in participating, you can join the working group at http://standards.scte.org.
For more information about Fujitsu network digital transformation solutions, including network automation and AI/ML, visit our virtual booth at Cable-Tec 2020. Registration is free.