Introduction
The idea of a golden configuration for enterprise wired and wireless LANs is a legacy practice, much like expecting to RMA a network element whenever an issue is encountered. Often, a firmware change is all that is needed.
I’ll be going into this in more detail in this blog, as AI and the ability to trigger remedies are giving us a faster and better way to diagnose the root cause of presumed hardware issues, without the RMA, the waiting, the downtime and the burden on IT resources. There is also a demo you can watch before we dive deeper.
When unexpected network incidents such as CPU spikes occur, IT administrators are often forced to take down the network element until a replacement is shipped and installed. On occasion, the next firmware version may be the needed fix, but replacing the hardware often seems to be the fastest solution. If the replacement exhibits the same problem, they’re forced to wait for the new firmware.
During this process, they run sanity checks with scripts and discover regression issues. They go through multiple iterations until they gain confidence, however shaky it may be, and they lose valuable hours of productivity along the way.
Nile’s Closed Loop Automation Advantage
Nile Services Cloud powers the following five groups of AI networking capabilities, which together make up closed loop automation within the Nile Access Service.
- Design Pipeline
- Digital Twin
- Defense Hub
- Smart Agents
- Cognitive Decisions
Each plays a role in helping us capture data, recognize network or security issues, and either resolve a problem or optimize our service.
Nile’s AI in Action
In the example shown in Fig. 1, models in the Cognitive Decisions module detected an unusual spike in memory errors in one of the access points. If this condition is left unaddressed, the traffic will be blackholed until manual intervention fixes the problem. This is not ideal, as the site may be remote and the fix may require traveling to the location.
Fig. 1: Cognitive Decisions Updates Digital Twin with Memory Errors
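To make "an unusual spike" concrete, here is a minimal sketch of one simple way such a condition could be flagged: a z-score test against recent history. This is not a description of the models inside Cognitive Decisions; the function name `is_spike`, the threshold and the sample counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_spike(history, latest, threshold=3.0):
    """Return True if `latest` sits more than `threshold` standard
    deviations above the mean of `history` (a plain z-score test)."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest > mu  # flat history: any increase is unusual
    return (latest - mu) / sigma > threshold


# Hypothetical per-interval memory error counts for one access point.
baseline_counts = [2, 3, 1, 2, 4, 3, 2]
print(is_spike(baseline_counts, 3))   # a value in the normal range
print(is_spike(baseline_counts, 40))  # a value far above the normal range
```

A production system would likely use richer signals than a single counter, but the shape of the decision is the same: compare what is observed with what is expected and act when the gap is large.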
Cognitive Decisions Takes Remedial Actions
The Cognitive Decisions module immediately takes remedial action by automatically restarting the affected access point, ensuring that the network continues to function until the long-term remedy is issued.
The AI in the Cognitive Decisions module also searches the existing database of issues for one that matches the memory error seen in the access point. Upon finding a match, it raises an alert to production network engineers (PNEs) describing the incident and attaching a description of the matching issue.
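The two steps above (restart first, then look for a known cause) can be sketched roughly as follows. Everything here is hypothetical: the `KnownIssue` and `Anomaly` types, the exact-match lookup, the identifiers and the callback names are assumptions for illustration, not Nile's actual data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KnownIssue:
    issue_id: str      # hypothetical identifier format
    anomaly_kind: str  # anomaly this issue is known to produce
    description: str
    fixed_in: str      # firmware version assumed to contain the fix


@dataclass
class Anomaly:
    ap_id: str
    kind: str


def find_matching_issue(anomaly: Anomaly, issues: List[KnownIssue]) -> Optional[KnownIssue]:
    """Return the first known issue whose anomaly kind matches, else None."""
    for issue in issues:
        if issue.anomaly_kind == anomaly.kind:
            return issue
    return None


def handle_anomaly(anomaly, issues, restart_ap, alert_engineers):
    """Restart first to keep traffic flowing, then look for a known cause."""
    restart_ap(anomaly.ap_id)
    match = find_matching_issue(anomaly, issues)
    if match is not None:
        alert_engineers(
            f"{anomaly.kind} on {anomaly.ap_id}: matches {match.issue_id} "
            f"({match.description}), fixed in {match.fixed_in}"
        )
    return match


# Illustrative data, with stand-in callbacks that only record what happened.
issues = [KnownIssue("ISSUE-1", "memory_errors", "memory errors in AP firmware", "2.0.1")]
restarted, alerts = [], []
matched = handle_anomaly(Anomaly("ap-17", "memory_errors"), issues,
                         restarted.append, alerts.append)
print(restarted)
print(alerts)
```

Passing the restart and alert actions in as callbacks keeps the decision logic separate from whatever mechanism actually reboots a device or pages an engineer.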
Cognitive Decisions Alerts Stakeholders
The Cognitive Decisions module alerts the Nile customer success manager (CSM) responsible for the customer where the access point is deployed, with the following information:
- Conditional Attributes
  - Access point details
  - Detected anomaly
  - Software issue causing the anomaly
- The Remedy
  - Immediate remedial action required. For instance, an AP firmware upgrade is needed to address the issue
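As an illustration of how the information listed above could be carried in a single alert, here is a minimal sketch of a payload structure. The class names, field names and sample values are assumptions chosen to mirror the list; they are not Nile's actual alert format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ConditionalAttributes:
    access_point: dict     # access point details
    detected_anomaly: str
    software_issue: str    # software issue causing the anomaly


@dataclass
class Alert:
    conditional_attributes: ConditionalAttributes
    remedy: str


# Hypothetical values, for illustration only.
alert = Alert(
    conditional_attributes=ConditionalAttributes(
        access_point={"id": "ap-17", "site": "example-site", "firmware": "2.0.0"},
        detected_anomaly="spike in memory errors",
        software_issue="ISSUE-1",
    ),
    remedy="AP firmware upgrade needed to address the issue",
)

# asdict() recurses into the nested dataclass, so the result serializes directly.
payload = json.dumps(asdict(alert), indent=2)
print(payload)
```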
Cognitive Decisions Initiates Long Term Remedy
During this process, a maintenance window is created in collaboration with the customer and the Nile CSMs, during which the access point’s software can be upgraded.
Upon successful completion of a firmware upgrade, the Smart Agents capability within the Nile Services Cloud analyzes data from the updated access points against the expected baseline performance metrics within the Digital Twin. Cognitive Decisions processes the data and updates the Digital Twin so that the firmware information for the access point reflects the upgraded version.
This closed loop approach to network operations automates the monitoring of network incidents and their resolution, leading to lower MTTR and a better user experience. No more waiting for the right firmware version with the right patch, testing different combinations, addressing regression issues, or losing productivity.
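A rough sketch of this post-upgrade step might look like the following. The twin is modeled as a plain dictionary, the metrics are assumed to be "lower is better", and the 10% tolerance, metric names and function names are all hypothetical; the real Digital Twin and Smart Agents are certainly richer than this.

```python
def within_baseline(observed, baseline, tolerance=0.10):
    """True if every baseline metric was observed and none exceeds its
    expected value by more than `tolerance` (metrics are lower-is-better)."""
    for metric, expected in baseline.items():
        if metric not in observed:
            return False
        if observed[metric] > expected * (1 + tolerance):
            return False
    return True


def record_upgrade(twin, ap_id, new_firmware, observed):
    """Record the new firmware version in the twin and note whether the
    observed metrics meet the stored baseline."""
    entry = twin[ap_id]
    entry["firmware"] = new_firmware
    entry["meets_baseline"] = within_baseline(observed, entry["baseline"])
    return entry["meets_baseline"]


# Hypothetical twin entry and post-upgrade measurements.
twin = {
    "ap-17": {
        "firmware": "2.0.0",
        "baseline": {"memory_errors_per_hour": 5, "cpu_percent": 60},
    }
}
ok = record_upgrade(twin, "ap-17", "2.0.1",
                    {"memory_errors_per_hour": 1, "cpu_percent": 42})
print(ok, twin["ap-17"]["firmware"])
```

Storing the baseline check result next to the firmware version means a later pass can tell at a glance which upgraded devices still need attention.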
Conclusion
Nile’s ability to validate real-time performance metrics of network software functions against a Digital Twin, enabling proactive detection and resolution to guarantee performance outcomes, is unique in the industry. In the example provided in this blog, Nile’s AI networking approach moves us beyond expecting RMAs for every little issue, uses AI to examine repetitive TAC cases, and creates efficiencies. The following image outlines each step in the process at a high level and how the different modules within the Nile Services Cloud interact.