Overview of ERPS
Redundant links (e.g., ring networks) are often used in Ethernet switching networks for link backup and to improve network reliability. However, redundant links form loops in the network, which can cause broadcast storms and unstable MAC address tables and interrupt network communication for users.
Figure 1. ERPS Diagram
As shown in Figure 1, Ethernet Ring Protection Switching (ERPS) is a Layer 2 ring protection protocol defined by ITU-T in Recommendation ITU-T G.8032/Y.1344. The standard defines the Ring Automatic Protection Switching (R-APS, also written RAPS) protocol messages and the ring protection mechanism.
PICOS supports both ERPSv1 and ERPSv2. ERPSv2 is fully compatible with ERPSv1 and adds the following extensions:
- Sub-ring support
- Sub-ring virtual channel/non-virtual channel transmission of RAPS messages
- Manual switching of port blocking, including Forced Switch and Manual Switch
- Configurable revertive/non-revertive modes for ERPS rings
- Sub-ring topology change notifications
ERPSv1 supports only major-ring networking, while ERPSv2 supports major rings, sub-rings, and mixed networking of major rings and sub-rings.
ERPS builds on the strengths of ring protection technologies such as STP, with an optimized detection mechanism and faster convergence. It also interoperates with switches from other manufacturers in the same ring, provided they also support the ERPS protocol.
Terminology
Ring, Major Ring and Sub-ring
An ERPS ring is a group of interconnected Layer 2 switching devices configured with the same control VLAN. It is the basic unit of the ERPS protocol.
An ERPS ring can be a major ring or a sub-ring. By default, an ERPS ring is a major ring. The major ring is a closed ring, and the sub-ring is a non-closed ring, which must be declared with the CLI command set protocols erps ring <ring-id> sub-ring <true | false>. Sub-ring configuration is supported only in ERPSv2.
The Ethernet ring control module supports multiple rings in each node (two interfaces are part of each ring). The ring control module also supports the intersection of multiple rings. Intersection of two rings means that two rings might share the same link or share the same node.
As shown in Figure 2, Switch A, Switch B, Switch C and Switch D form an ERPS major ring, and Switch B, Switch C and Switch E form an ERPS sub-ring. The ring ID is a unique identifier for each physical ring and is configured with the command set protocols erps ring <ring-id>.
Figure 2. ERPS Major Ring and Sub-ring
The protocol messages of the major ring are transmitted only on the major ring, and the protocol messages of the sub-ring will terminate at the intersecting nodes and will not enter the major ring. However, when there is a link fault in the sub-ring, it is necessary to advertise the topology change information of the sub-ring to the major ring at the intersecting nodes Switch B and Switch C, which is fulfilled by the tcn-propagation function.
Users can plan and deploy ERPS major rings and sub-rings in accordance with the actual network topology and usage environment.
Ring Instance
The ERPS ring is a physical ring, and the ring instance can be understood as a logical ring on the physical ring. A maximum of eight ERPS rings (including both major ring and sub-ring) are supported on a device, and a maximum of two instances can be configured for each ring. Additionally, at least one instance should be configured for each ring.
To improve link utilization, ERPS supports the configuration of up to two logical ERPS rings on a single physical ring, i.e., two instances. Usually, different instances should have unique configurations such as port roles, control VLANs, etc. Each instance has its own blocking port, which will be blocked or unblocked separately, without affecting other instances. Topology calculation for different instances will not affect each other.
For example, different protected instances can be configured for different ring instances so that data traffic belonging to different VLANs is transmitted over different paths. This achieves load sharing and link backup of traffic and maximizes the utilization of link resources.
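The limits stated in this section can be checked before a configuration is applied. The following is a minimal sketch in Python, not part of PICOS; the function name check_ring_plan and the plan layout (ring ID mapped to a list of instance IDs) are assumptions made for illustration.

```python
MAX_RINGS_PER_DEVICE = 8      # from the text: at most eight ERPS rings per device
MAX_INSTANCES_PER_RING = 2    # from the text: at most two instances per ring


def check_ring_plan(rings):
    """Return a list of problems in a {ring_id: [instance_id, ...]} plan.

    Hypothetical helper for planning only; it does not talk to a switch.
    """
    problems = []
    if len(rings) > MAX_RINGS_PER_DEVICE:
        problems.append(
            f"{len(rings)} rings exceed the limit of {MAX_RINGS_PER_DEVICE}"
        )
    for ring_id, instances in sorted(rings.items()):
        if len(instances) < 1:
            problems.append(f"ring {ring_id} has no instance")
        elif len(instances) > MAX_INSTANCES_PER_RING:
            problems.append(
                f"ring {ring_id} has more than {MAX_INSTANCES_PER_RING} instances"
            )
    return problems


# Two instances on ring 1 (for example, to share load) and one on ring 2.
print(check_ring_plan({1: [1, 2], 2: [1]}))
```

An empty list means the plan respects the per-device and per-ring limits described above.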
Node
The Layer 2 switching devices that join the ERPS ring are called ERPS nodes. No more than two ports per node can join the same ERPS ring; these ports are named ERPS port0 and port1. As shown in Figure 2, Switch A, Switch B, Switch C and Switch D are the nodes of the major ring, and Switch B, Switch C and Switch E are the nodes of the sub-ring.
Port Role
There are three types of ERPS port role: RPL Owner Port, RPL Neighbor Port and Ordinary Port.
- RPL Owner Port
Each ERPS ring instance has only one RPL owner port, which is determined by user configuration. Blocking the RPL owner port from forwarding user traffic prevents network loops in the ERPS ring. While blocked, the RPL owner port is in the discarding state and can only send and receive ERPS protocol packets.
When the device with the RPL owner port receives a link failure message and learns that another node or link on the ERPS ring is down, it automatically unblocks the RPL owner port. The port state changes to forwarding and the port resumes receiving and sending traffic, so that traffic is not interrupted.
The link on which the RPL owner port is located is the Ring Protection Link (RPL), which normally does NOT allow traffic to pass.
- RPL Neighbor Port
RPL neighbor port refers to the port on the RPL link that is directly connected to the RPL owner port. It needs to be specified by the user configuration.
In normal conditions, both the RPL owner port and the RPL neighbor port are blocked to prevent network loops.
When a link failure occurs, both the RPL owner port and the RPL neighbor port are unblocked.
- Ordinary Port
The ERPS port is an ordinary port if not specified as an RPL owner port or RPL neighbor port.
The ordinary ports are responsible for monitoring the link status of the ERPS ring and informing other ring nodes.
Control VLAN
In an ERPS ring, the control VLAN is used to transmit ERPS protocol packets.
- Each ERPS ring instance must be configured with a control VLAN.
- Different ERPS ring instances cannot use the same control VLAN.
- The same control VLAN must be configured for all devices in the same ERPS ring instance.
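The three control VLAN rules above lend themselves to an offline consistency check. The sketch below is illustrative only; the function name check_control_vlans and the input layout (device name mapped to a dictionary keyed by ring ID and instance ID) are assumptions, not a PICOS data model.

```python
def check_control_vlans(config):
    """config maps device name -> {(ring_id, instance_id): control_vlan or None}.

    Returns a list of rule violations (hypothetical planning helper).
    """
    problems = []
    vlan_of = {}   # (ring, instance) -> control VLAN first seen for it
    for device in sorted(config):
        for key, vlan in sorted(config[device].items()):
            if vlan is None:
                # Rule 1: every ring instance needs a control VLAN.
                problems.append(f"{device}: instance {key} has no control VLAN")
                continue
            expected = vlan_of.setdefault(key, vlan)
            if expected != vlan:
                # Rule 3: all devices in one ring instance use the same VLAN.
                problems.append(
                    f"{device}: instance {key} uses VLAN {vlan}, expected {expected}"
                )
    owners = {}    # control VLAN -> first (ring, instance) that uses it
    for key, vlan in sorted(vlan_of.items()):
        other = owners.setdefault(vlan, key)
        if other != key:
            # Rule 2: different ring instances must not share a control VLAN.
            problems.append(f"instances {other} and {key} share control VLAN {vlan}")
    return problems


plan = {
    "SwitchA": {(1, 1): 4001},
    "SwitchB": {(1, 1): 4001, (2, 1): 4002},
}
print(check_control_vlans(plan))
```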
Data VLAN
Data VLANs are the group of VLANs used in the ring to transmit user traffic. They should be defined in the MSTP instance-to-VLAN mapping.
Protected Instance
A protected instance is the MSTP instance whose mapped data VLANs need ERPS ring protection. Before configuring ERPS, users need to configure the protected MSTP instance and its VLAN mapping with the command set protocols spanning-tree mstp msti <msti> vlan <vlan-id>. Then, configure the MSTP instance as the protected instance of the ERPS ring instance with the command set protocols erps ring <ring-id> instance <instance-id> protected-instance <msti>.
The control VLAN and data VLANs must be configured in the protected instance; only then can the ERPS protocol process messages of these VLANs.
NOTE:
To share the forwarding path of data traffic between intersecting rings, configure the control VLANs and data VLANs of the intersecting rings in the same protected instance.
For example, assume that two intersecting rings use different control VLANs (such as VLAN 200 and VLAN 300) and share the same data VLAN (such as VLAN 100). In this case, the control VLANs and the data VLAN (VLAN 100, VLAN 200, VLAN 300) must all be configured in the same protected instance to enable data forwarding between the intersecting rings.
You can refer to Example for Configuring ERPS (Intersection Rings) for detailed configuration.
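As a rough illustration of the VLAN 100/200/300 example, the Python sketch below fills in the two command templates quoted in this section. The MSTI number 1, ring IDs 1 and 2, and instance ID 1 are assumed values, and one command is emitted per VLAN because the accepted syntax of <vlan-id> (single value, list, or range) is not stated here.

```python
def protected_instance_commands(msti, vlans, ring_instances):
    """Build set commands from the two templates quoted in the text.

    msti, vlans and ring_instances are illustrative inputs; the helper only
    formats strings and does not validate them against a device.
    """
    cmds = [
        f"set protocols spanning-tree mstp msti {msti} vlan {vlan}"
        for vlan in vlans
    ]
    cmds += [
        f"set protocols erps ring {ring} instance {inst} protected-instance {msti}"
        for ring, inst in ring_instances
    ]
    return cmds


# Data VLAN 100 plus control VLANs 200 and 300 in one protected instance,
# referenced by both intersecting rings (ring and instance IDs are assumed).
for line in protected_instance_commands(1, [100, 200, 300], [(1, 1), (2, 1)]):
    print(line)
```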
ERPS Timer
There are four timers used in the ERPS protocol: Guard Timer, WTR (Wait-to-Restore) Timer, Holdoff Timer and WTB (Wait-to-Block) Timer.
- Guard Timer
After a signal failure (SF) recovers or an operation that clears the SF condition is detected, the device involved in the failure sends R-APS (NR) messages to the other nodes and starts the guard timer at the same time. Until the guard timer expires, R-APS (NR) messages are not processed, which prevents the device from acting on outdated R-APS (NR) messages. If R-APS (NR) messages from other ports are still received after the guard timer has expired, the port changes to the Forwarding state.
The guard timer can be configured. The default time interval is 500 milliseconds; the time interval ranges from 10 to 2000 milliseconds.
- Wait-to-Restore (WTR) Timer
When recovering from a signal failure (SF) condition, the WTR timer is used to prevent frequent operation of protection switching due to intermittent SF defects.
The WTR timer can be configured. The interval must be long enough to allow the recovering network to become stable. The default time interval is 5 minutes; the time interval ranges from 1 to 12 minutes.
- Holdoff Timer
If the holdoff timer is specified, a defect is not reported to the ring protection mechanism immediately. Instead, the holdoff timer is started. On expiration of the timer, if the defect still exists, it is reported to protection switching.
The holdoff timer can be configured. The default time interval is 0 milliseconds; the time interval ranges from 0 to 1000 milliseconds.
- WTB (Wait-to-Block) Timer
The WTB timer is started when the forced switch or manual switch state of a port is cleared. Because there may be multiple manually switched blocking nodes in an ERPS ring, the clearing operation takes effect only when the WTB timer expires, which prevents blocking-point oscillations caused by immediately blocking the RPL owner port.
The WTB timer cannot be configured directly via the CLI. Its value is derived from the configured Guard Timer value plus 5 seconds. The default value is 7 seconds.
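The timer ranges and the WTB derivation can be summarized in a short Python sketch. It encodes only the ranges and the "Guard Timer plus 5 seconds" relation stated above; the function names and the treatment of a defect that ends exactly when the holdoff timer expires are assumptions.

```python
GUARD_RANGE_MS = (10, 2000)      # Guard Timer range from the text
WTR_RANGE_MIN = (1, 12)          # WTR Timer range from the text
HOLDOFF_RANGE_MS = (0, 1000)     # Holdoff Timer range from the text


def validate_timers(guard_ms=500, wtr_min=5, holdoff_ms=0):
    """Raise ValueError if a timer value is outside the documented range."""
    checks = [
        ("guard", guard_ms, GUARD_RANGE_MS),
        ("wtr", wtr_min, WTR_RANGE_MIN),
        ("holdoff", holdoff_ms, HOLDOFF_RANGE_MS),
    ]
    for name, value, (low, high) in checks:
        if not low <= value <= high:
            raise ValueError(f"{name} timer {value} outside {low}..{high}")


def wtb_ms(guard_ms):
    """WTB value derived as the Guard Timer value plus 5 seconds."""
    return guard_ms + 5000


def defect_reported(defect_duration_ms, holdoff_ms=0):
    """A defect reaches protection switching only if it is still present
    when the holdoff timer expires (boundary handling is an assumption)."""
    return defect_duration_ms >= holdoff_ms


validate_timers(guard_ms=2000, wtr_min=5, holdoff_ms=100)
print(wtb_ms(2000))                          # Guard Timer of 2000 ms
print(defect_reported(50, holdoff_ms=100))   # short glitch is filtered out
```

With a holdoff interval of 0, every defect is reported immediately, which matches the default behavior described above.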
Revertive/Non-revertive Mode
ERPS revertive/non-revertive mode determines whether the RPL owner port is re-blocked when the failed link recovers.
- In revertive mode, if the failed link recovers, the RPL owner port is re-blocked after the WTR timer expires, so the blocked link reverts to the RPL.
- In non-revertive mode, if the failed link recovers, the WTR timer is not started, and the blocked link remains the original failed link and does not revert to the RPL.
By default, the ERPS ring is in revertive mode.
ERPSv2 supports the configuration of revertive and non-revertive modes, while ERPSv1 supports only revertive mode.
Port Blocking Switching Method
Since the RPL may have higher bandwidth than other links in the ring, users can consider blocking a lower-bandwidth link so that data traffic is transmitted over the RPL.
ERPS supports two switching methods to manually configure port blocking: Forced Switch and Manual Switch.
- Forced Switch: Ports configured for forced switch are blocked immediately, regardless of whether other links on the ring are faulty or not.
- Manual Switch: Ports configured for manual switch are blocked if the state of the ring is Idle or Pending, otherwise they are not blocked.
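The difference between the two methods can be expressed as a small decision function. This is a minimal Python sketch of the two rules above; the command strings and the function name are hypothetical labels, not CLI keywords.

```python
def port_blocked_by_switch(command, ring_state):
    """Return True if the port is blocked after the given switch command.

    command: "forced-switch" or "manual-switch" (labels used in this sketch only).
    ring_state: current ring state name, for example "Idle" or "Pending".
    """
    if command == "forced-switch":
        # Forced switch blocks immediately, whatever the state of the ring.
        return True
    if command == "manual-switch":
        # Manual switch takes effect only in the Idle or Pending state.
        return ring_state in ("Idle", "Pending")
    raise ValueError(f"unknown command: {command}")


print(port_blocked_by_switch("manual-switch", "Idle"))
print(port_blocked_by_switch("manual-switch", "Protection"))
print(port_blocked_by_switch("forced-switch", "Protection"))
```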
In addition to forced switch and manual switch, ERPS also supports the clear operation, which is used in the following three cases:
- Clearing the locally configured manual switch and forced switch configurations.
- When the ERPS ring is in the revertive mode, the revert action is triggered manually before the WTB Timer or WTR Timer expires.
- When the ERPS ring is in non-revertive mode, the revert action is triggered manually.
Users can use the run show erps ring <ring-id> [instance <instance-id>] command to view the detailed information of the ERPS ring instances. If forced switch (or manual switch) is set successfully, the Node state in the output displays Forced Switch (or Manual Switch).
admin@PICOS# run show erps ring 1
Ring ID: 1
Port0: te-1/1/19
Port1: te-1/1/7
Ring-MAC: false
Sub-ring: No
Virtual-channel: No
Instance ID: 1
Enable: Yes
Active: true
Node state: Forced Switch
Description:
Control VLAN: 4001
Protected instance: 1
Protected VLAN: 100-101,111,4094
……
Port blocking manual switching is an ERPSv2 feature which is not supported in ERPSv1.
Virtual-Channel Sub-ring RAPS Message Transmission Method
When ERPS protocol is deployed in multi-ring networking, the transmission methods of RAPS messages on sub-ring nodes are categorized into Virtual-Channel (VC) and Non-Virtual-Channel (NVC).
With the Virtual-Channel method, the RAPS protocol messages of the sub-ring are transmitted in the major ring through the intersecting node. That is, the intersecting node does not terminate the protocol messages of the sub-ring. In this topology, the RPL owner port of the sub-ring blocks both the RAPS protocol messages and the data traffic of the sub-ring.
R-APS messages from sub-rings are forwarded over virtual channels to be broadcast or multicast over the interconnected network.
R-APS messages forwarded by a sub-ring over a virtual channel need to be distinguishable from R-APS messages of other rings, which can be achieved by using separate control VLANs for the R-APS virtual channels of different sub-rings.
To enable Virtual-Channel method, the following configurations are required:
1. Enable virtual channel on all the devices of the sub-ring, where the <ring-id> is the sub-ring ID.
set protocols erps ring <ring-id> virtual-channel <true | false>
2. Add the ports of major ring, which are used for forwarding R-APS messages from the sub-ring, to the control VLAN of the sub-ring.
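Step 1 can be sketched by filling in the sub-ring and virtual-channel command templates quoted in this document. The Python helper below only formats strings; its name and the ring ID 2 are assumptions, and step 2 is left out because no command for it is given here.

```python
def sub_ring_commands(ring_id, virtual_channel=False):
    """Build the sub-ring declaration and virtual-channel commands.

    Uses only the two templates quoted in the text; ring_id is illustrative.
    """
    flag = "true" if virtual_channel else "false"
    return [
        f"set protocols erps ring {ring_id} sub-ring true",
        f"set protocols erps ring {ring_id} virtual-channel {flag}",
    ]


# Assumed sub-ring ID 2 with the Virtual-Channel method enabled.
for line in sub_ring_commands(2, virtual_channel=True):
    print(line)
```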
Non-Virtual-Channel Sub-ring RAPS Message Transmission Method
With the Non-Virtual-Channel method, the RAPS protocol messages of the sub-ring terminate on the intersecting node, and the RPL owner port of the sub-ring blocks only the data traffic but not the RAPS protocol messages of the sub-ring.
If a failure occurs on any link of the sub-ring, the RAPS channel of the sub-ring may be segmented, which prevents RAPS messages from being exchanged between some sub-ring nodes.
By default, the sub-ring RAPS message is transmitted with non-virtual channel method.
Connect Ring for Sub-ring
On a multi-ring network, associate a ring (this is the connect ring, e.g. ring A) with a sub-ring (e.g. sub-ring B) if you want to advertise topology changes in the sub-ring (sub-ring B) to the ring (ring A).
Before using the tcn-propagation function to forward a topology change notification to the connect ring whenever the topology of the sub-ring changes, you need to configure a connect ring for the sub-ring. By default, a sub-ring has no connect ring.
Tcn-propagation
The tcn-propagation (topology change notification propagation) function enables sub-ring topology change notifications. When the topology of the sub-ring changes, the FDB refresh of the port generates a Topology Change (TC) signal. When the tcn-propagation function is enabled, the intersecting node sends an event flush message to the connect ring when it receives the TC signal.
By default, this feature is disabled.
To enable sub-ring topology change notifications, the following configurations are required:
1. Configure the connect ring for a sub-ring on all intersecting nodes.
set protocols erps ring <ring-id> instance <instance-id> connect ring <ring-id> instance <instance-id>
2. Enable tcn-propagation function to advertise topology changes to the connect ring on all intersecting nodes.
set protocols erps tcn-propagation <true | false>
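The two steps can be sketched for one intersecting node as follows. This Python helper fills in the two templates above; it assumes that the first <ring-id>/<instance-id> pair names the sub-ring and the second pair names the connect ring, and the IDs used in the example are illustrative.

```python
def tcn_propagation_commands(sub_ring, sub_instance, connect_ring, connect_instance):
    """Build the connect-ring and tcn-propagation commands for one node.

    Assumption: the first ring/instance pair is the sub-ring and the second
    pair is the connect ring that receives the topology change notification.
    """
    connect = (
        f"set protocols erps ring {sub_ring} instance {sub_instance} "
        f"connect ring {connect_ring} instance {connect_instance}"
    )
    return [connect, "set protocols erps tcn-propagation true"]


# Assumed IDs: sub-ring 2 instance 1 reports changes to major ring 1 instance 1.
for line in tcn_propagation_commands(2, 1, 1, 1):
    print(line)
```

The same commands would be repeated on every intersecting node, since both steps apply to all intersecting nodes.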
ERPS Operation Mechanism
In normal conditions, the devices on the ring communicate normally, and all ports forward traffic except the RPL owner port, which is blocked by ERPS to prevent network loops. When a link failure occurs, two ERPS operation processes are involved: Link Failure and Recovery of Link Failure, which are described below.
Link Failure
As shown in Figure 3, when the link between Switch B and Switch C fails, the ERPS protocol initiates the protection switching mechanism: it blocks the ports at both ends of the failed link and then unblocks the RPL owner port, which resumes receiving and sending user traffic, thus ensuring uninterrupted traffic. The detailed process is as follows:
- Switch B and Switch C detect a link failure, block the port on the failed link, and perform the FDB flush.
- Switch B and Switch C then start sending R-APS (SF) messages periodically with the (node ID, BPR) pair on both ring ports, while the SF condition persists.
- When the other devices receive the R-APS (SF) message from Switch B and Switch C, they all perform the FDB flush. When Switch A (the device where the RPL owner port is located) receives this R-APS message, it unblocks the RPL owner port and RPL neighbor port and performs the FDB flush.
Figure 3. ERPS Link Failure
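The failure handling above can be modeled with a toy four-node ring. This Python sketch is a simplified illustration, not the PICOS implementation: it tracks blocked links rather than individual ports, and it assumes the RPL is the link between Switch A and Switch D, which the text does not state.

```python
def reachable(links, blocked, start):
    """Nodes reachable from start over unblocked links (depth-first search)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for link in links:
            if link in blocked or node not in link:
                continue
            (peer,) = link - {node}
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen


def handle_link_failure(rpl, failed):
    """Return (blocked_links, events) after the three steps described above."""
    blocked = {rpl}                       # normal state: only the RPL is blocked
    a, b = sorted(failed)
    blocked.add(failed)                   # step 1: both ends block the failed link
    events = [f"{a} and {b} block the failed link and flush the FDB"]
    events.append(f"{a} and {b} send R-APS (SF) periodically")   # step 2
    if failed != rpl:
        blocked.discard(rpl)              # step 3: RPL owner/neighbor unblock
        events.append("RPL owner unblocks the RPL and flushes the FDB")
    return blocked, events


links = [frozenset("AB"), frozenset("BC"), frozenset("CD"), frozenset("DA")]
rpl = frozenset("DA")                     # assumed RPL location
blocked, events = handle_link_failure(rpl, frozenset("BC"))
print(sorted(reachable(links, blocked, "A")))
for event in events:
    print(event)
```

After switching, exactly one link is blocked, so every node stays reachable and no loop exists.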
Recovery of Link Failure
After the link has been recovered from the failure, it can be used to transmit user traffic again, and the RPL owner port will be blocked again. The detailed recovery process is as follows:
- When the link between Switch B and Switch C is recovered, Switch B and Switch C start a Guard Timer to prevent reception of outdated R-APS (NR) messages, and do not receive other R-APS protocol messages until the timer expires. At the same time, Switch B and Switch C send R-APS (NR) messages to other nodes.
- When Switch A (the device where the RPL owner port is located) receives the R-APS (NR) message, it starts the WTR Timer. On expiration of the WTR timer, the RPL owner node blocks its end of the RPL, sends an R-APS (NR, RB) message with the (node ID, BPR) pair and performs the FDB flush.
- When Switch B and Switch C receive the NR-RB (No Request, RPL Blocked) R-APS message from Switch A, they remove the block on their blocked ring ports, stop sending R-APS (NR) messages and perform the FDB flush. In addition, Ethernet ring nodes B to E perform the FDB flush when receiving an R-APS (NR, RB) message due to the node ID and BPR-based mechanism.
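The recovery sequence, together with the revertive and non-revertive modes described earlier, can be sketched in the same link-level style. This Python model is a simplification under stated assumptions: timers are represented only as events, and the RPL is assumed to be the link between Switch A and Switch D.

```python
def handle_link_recovery(rpl, recovered, revertive=True):
    """Return (blocked_links, events) once a failed link has recovered.

    Simplified model: timers appear as events only, not as real delays.
    """
    blocked = {recovered}     # during the failure, the failed link is blocked
    a, b = sorted(recovered)
    events = [f"{a} and {b} start the Guard Timer and send R-APS (NR)"]
    if not revertive:
        # Non-revertive mode: no WTR timer, the block stays where it is.
        events.append("WTR Timer not started; block stays on the recovered link")
        return blocked, events
    events.append("RPL owner starts the WTR Timer")
    blocked.add(rpl)          # on WTR expiry, the owner blocks its end of the RPL
    events.append("RPL owner blocks the RPL, sends R-APS (NR, RB), flushes the FDB")
    if recovered != rpl:
        blocked.discard(recovered)   # link nodes unblock on receiving (NR, RB)
        events.append(f"{a} and {b} unblock their ports and flush the FDB")
    return blocked, events


rpl = frozenset("DA")         # assumed RPL location
blocked, events = handle_link_recovery(rpl, frozenset("BC"))
print(blocked == {rpl})
for event in events:
    print(event)
```

Note that in this model the RPL is blocked before the recovered link is unblocked, so the ring never has zero blocked links during recovery.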
Copyright © 2024 Pica8 Inc. All Rights Reserved.