5G network slices play an important role in delivering differentiated and guaranteed connectivity services to enterprise and vertical industry tenants. As 5G network slicing matures in 3GPP specifications and mass-scale adoption comes into view, the management of slices, including assurance (monitoring and collection, analytics, and decision), plays a key role in the distributed intelligence and closed-loop assurance of 5G networks. This is key for communications service providers (CSPs) to become digital service providers.
In manual or semi-automated modes of operation, centralizing service assurance and aggregating all domains and layers is necessary to provide an end-to-end service perspective, enable operations, and rationalize duplicated assurance systems, which optimizes IT and OSS costs. However, as the introduction of 5G slicing accelerates the move to autonomous networks, the analytics, closed-loop assurance, and intelligence capabilities need to be distributed across the layers (communication service, network slice, network slice subnet and network function) and autonomous domains (RAN, transport, and core), as defined in the TMF Autonomous Network Framework, ETSI ZSM, and 3GPP (the closed-loop hierarchy and the distribution of analytics in the network with NWDAF).
With this, the role of service assurance will become even more service-centric than before: it will not only assure end-to-end connectivity services and slice SLAs, but also assure that the autonomous domains fulfil their intents or autonomy KPIs for the slices they are serving, and it will act as a fallback option when an autonomous domain fails.
Capabilities of 5G slice assurance
Assuring 5G slices requires a set of capabilities across the distributed assurance functions. These capabilities will extend the traditional service assurance to autonomous domain assurance and infrastructure assurance.
- Service and SLA centricity: Advanced connectivity services offered through 5G network slices are expected to guarantee service level agreements (SLAs) for connectivity characteristics such as data speed, quality, latency, and reliability, which are either pre-defined through the eMBB, uRLLC and mMTC slice types or customized per tenant using GSMA network slice templates. This lays the foundation for the assurance of diverse vertical industry tenants, such as aviation, automotive and healthcare. The need to focus on the true slice quality of service (QoS) measures and their SLAs becomes more critical across all the layers and even across CSP boundaries. Service-centric assurance with active assurance is key to enabling service and autonomous domain assurance, combining end-to-end testing of slices or slice subnets with analytics for service impact and root cause analysis.
- Tenant observability: When advanced connectivity is bundled with vertical industry applications, it becomes imperative that vertical specific analytics be observed and correlated to slice analytics. An example of this is from the aviation industry, where aeroplane or passenger traffic correlates to connectivity traffic, allowing aeroplane maintenance to predict congestion and elasticity. This, in turn, drives closed-loop assurance and orchestration of the used eMBB slices considering the intent of the aviation vertical (aeroplane landing rate and maintenance work delay objectives).
- Real-time assurance to measure quality: As some slice types, such as uRLLC, are near real time (NRT), the frequency of measurements needs to match the latency requirements, so that fast (running within autonomous domains) and slow (running at the service assurance layer) closed-loop assurance functions can react to anomalies within their time constraints. This requires distributing assurance and analytics functions closer to the user plane network function and using active testing to provide NRT slice QoS measurements. The time constraints can also be eased by predicting problems ahead of time.
- Real-time assurance as topology changes: End-to-end connectivity service and network slice assurance need to constantly adapt to frequent autonomous network changes. Having an accurate representation of the topology is necessary to compute and validate analytics inferences for service impact and root cause analysis.
- Meeting 5G volume, scale and distribution: Because of the high level of abstraction, the explosion of vertical industries, and the growth of distributed and edge computing as Open-RAN adoption accelerates, the volume of network entities to be assured will increase significantly, making a cloud-native, scalable and distributed architecture a de facto requirement for slice assurance systems.
- Cloud management and multi-layer correlations: Slice assurance needs to adapt to all heterogeneous cloud deployment models: VNF or CNF based, and on-prem or in the public cloud. As domains or slice subnets share cloud infrastructure, assurance systems will need to perform multi-layer correlations between infrastructure, virtual resources, network function workloads and services (slice subnet, slice, and communication services) to predict service impacts and root causes, whether within a domain or across domains at the service layer.
- Closed-loop automation assurance: Assurance functions will move from the current human-led healing, optimization and reporting process to slow and fast closed control loops following the observe, orient, decide, act automation model as defined in ETSI ZSM, 3GPP and TMF AN, driven by policies and intent. The traditional role of assurance functions (observe and orient) will expand to contribute largely to decision and policy recommendations, serving the intelligence capability and integrating tightly with orchestration functions. Assuring and validating the hierarchical loops will be a major shift in the assurance paradigm.
- Predictive assurance using AIOps: The use of AIOps is at the heart of most slice assurance requirements. To name two examples, AIOps’ predictive assurance capability is key to anticipating issues on real-time flows and to discovering knowledge (RCA rules or actual remediation policies). The assurance functions will rely on centralized training of AI models and local AI model inferencing for the AIOps use cases of each closed-loop layer or domain.
- Data lakes and APIs: To enable AIOps and to empower tenants, assurance functions have a growing need for smart data lakes that consolidate data from all sources and feed consuming assurance functions and enterprise tenants. It is important to unlock the potential of data lakes and provide openness to enterprise tenants while using TMF assurance APIs, such as TMF 628, TMF 642 and TMF 621.
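To make the SLA-centric and closed-loop capabilities above more concrete, here is a minimal Python sketch of one pass through an observe, orient, decide, act loop for a single slice. Everything in it is an assumption made for illustration: the `SliceSla` and `Measurement` types, the threshold values, and the action names (`escalate_to_domain_loop`, `request_scale_out`) are hypothetical and do not come from ETSI ZSM, 3GPP or any TMF API.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass
class SliceSla:
    """Hypothetical SLA targets for one slice (illustrative only)."""
    max_latency_ms: float
    min_throughput_mbps: float


@dataclass
class Measurement:
    """One observed sample, e.g. from an active test (assumed shape)."""
    latency_ms: float
    throughput_mbps: float


def orient(samples: List[Measurement], sla: SliceSla) -> List[str]:
    """Orient: average the recent samples and list the breached SLA targets."""
    avg_latency = mean(s.latency_ms for s in samples)
    avg_throughput = mean(s.throughput_mbps for s in samples)
    breaches = []
    if avg_latency > sla.max_latency_ms:
        breaches.append("latency")
    if avg_throughput < sla.min_throughput_mbps:
        breaches.append("throughput")
    return breaches


def decide(breaches: List[str]) -> str:
    """Decide: map breaches to a recommended action (placeholder policy)."""
    if not breaches:
        return "no_action"
    if "latency" in breaches:
        return "escalate_to_domain_loop"
    return "request_scale_out"


def run_loop_once(samples: List[Measurement], sla: SliceSla,
                  act: Callable[[str], None]) -> str:
    """Run one observe -> orient -> decide -> act pass over observed samples."""
    action = decide(orient(samples, sla))
    if action != "no_action":
        act(action)  # Act: hand the recommendation to an orchestrator stub
    return action


if __name__ == "__main__":
    sla = SliceSla(max_latency_ms=10.0, min_throughput_mbps=50.0)
    observed = [Measurement(12.0, 80.0), Measurement(14.0, 90.0)]
    print(run_loop_once(observed, sla, act=lambda a: print("acting:", a)))
```

In a real deployment the decide step would be driven by policies or intent rather than hard-coded rules, and the act step would call an orchestration function. The sketch only shows how the assurance role extends from observing and orienting into recommending decisions.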
To conclude, assuring 5G connectivity services and network slices in diverse, autonomous, and disaggregated networks requires a new ecosystem, and new architecture blueprints need to be created for sourcing and delivering the required assurance capabilities. The three areas of assurance, namely service assurance, autonomous domain assurance and infrastructure assurance, will be delivered through independent software vendors, network equipment vendors, and cloud providers. Assuring 5G slices will require all three types of assurance, and collaborations should be explored to extend assurance beyond its current capabilities.
This blog was first posted on TMForum: https://inform.tmforum.org/insights/2021/06/5g-slice-assurance-in-autonomous-networks/