Overview of ERPS


NOTE:

Currently, the following features defined in ERPSv2 are not supported in PICOS:

  • Sub-ring
  • Configuration of revertive and non-revertive mode
  • Configuration of WTB timer

Redundant links (e.g., ring networks) are often used in Ethernet switching networks for link backup and to improve network reliability. However, redundant links form loops in the network, which can cause broadcast storms and unstable MAC address tables and, in turn, interrupt network communication for users.

Ethernet Ring Protection Switching (ERPS) is a Layer 2 ring protection protocol defined by ITU-T in Recommendation G.8032/Y.1344. It defines the Ring Automatic Protection Switching (R-APS) protocol messages and the ring protection mechanism.

ERPS builds on the strengths of loop-prevention technologies such as STP, with an optimized detection mechanism and faster convergence. It can also interoperate with switches from other manufacturers in the same ring, provided that those switches support the ERPS protocol.

Terminology

Ring

An ERPS ring is a group of interconnected Layer 2 switching devices configured with the same control VLAN and is the basic unit of the ERPS protocol. As shown in Figure 1, Switch A, Switch B, Switch C and Switch D form an ERPS ring. The ring ID is a unique identifier for a ring and is configured with the command set protocols erps ring <ring-id>.

Figure 1. ERPS Diagram

Ring Instance

The ERPS ring is a physical ring, and the ring instance can be understood as a logical ring on the physical ring. A maximum of eight ERPS rings are supported on a device, and a maximum of two instances can be configured for each ring.

To improve link utilization, ERPS supports up to two logical ERPS rings, i.e., two instances, on a single physical ring. Usually, different instances should have their own configuration, such as port roles and control VLANs. Each instance has its own blocking port, which is blocked or unblocked independently, and the topology calculation of one instance does not affect the other.

For example, different protected instances can be configured for different ring instances so that data traffic belonging to different VLANs is transmitted along different paths. This achieves load sharing and link backup for traffic and makes better use of link resources.
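As a rough illustration of this load sharing, the following Python sketch models a four-node ring with two instances. The ring order, the blocked links, and the VLAN ranges are made-up values, and blocking is modeled per link rather than per port, so this is only a simplified picture of the behavior described above.

```python
# Hypothetical ring order: the links are A-B, B-C, C-D and D-A.
RING = ["A", "B", "C", "D"]

# Hypothetical instances: each one blocks a different link and protects
# a different set of VLANs.
INSTANCES = {
    1: {"blocked_link": frozenset({"A", "B"}), "vlans": range(100, 200)},
    2: {"blocked_link": frozenset({"C", "D"}), "vlans": range(200, 300)},
}


def instance_for_vlan(vlan):
    """Return the ID of the instance that protects the given VLAN."""
    for instance_id, inst in INSTANCES.items():
        if vlan in inst["vlans"]:
            return instance_id
    raise ValueError(f"VLAN {vlan} is not protected by any instance")


def path(src, dst, vlan):
    """Return the node sequence from src to dst that avoids the blocked link."""
    blocked = INSTANCES[instance_for_vlan(vlan)]["blocked_link"]
    start = RING.index(src)
    for step in (1, -1):  # walk the ring in one direction, then the other
        nodes, i, usable = [src], start, True
        while RING[i] != dst:
            j = (i + step) % len(RING)
            if frozenset({RING[i], RING[j]}) == blocked:
                usable = False  # this direction crosses the blocked link
                break
            nodes.append(RING[j])
            i = j
        if usable:
            return nodes
    raise RuntimeError("no loop-free path found")


print(path("A", "B", 100))  # instance 1 blocks A-B, so traffic goes the long way
print(path("A", "B", 200))  # instance 2 blocks C-D, so traffic uses A-B directly
```

With these sample values, VLAN 100 traffic from A to B travels through D and C, while VLAN 200 traffic uses the direct link, so both halves of the ring carry traffic.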

Node

The Layer 2 switching devices that join the ERPS ring are called ERPS nodes. No more than two ports per node can join the same ERPS ring. As shown in Figure 1, Switch A, Switch B, Switch C and Switch D are the nodes of the ERPS ring.

Port Role

There are three ERPS port roles: RPL Owner Port, RPL Neighbor Port and Ordinary Port.

  • RPL Owner Port

Each ERPS ring instance has exactly one RPL owner port, which is determined by user configuration. Blocking the RPL owner port from forwarding user traffic prevents network loops in the ERPS ring. While blocked, the RPL owner port is in the discarding state and can only send and receive ERPS protocol packets.

When the device that holds the RPL owner port receives a link failure message and learns that another node or link on the ERPS ring is down, it automatically unblocks the RPL owner port. The RPL owner port changes to the forwarding state and resumes sending and receiving traffic, so that traffic is not interrupted.

The link on which the RPL owner port is located is the Ring Protection Link (RPL). Under normal conditions, the RPL does not pass any traffic other than ERPS protocol packets.

  • RPL Neighbor Port

The RPL neighbor port is the port on the RPL that is directly connected to the RPL owner port. It must be specified by user configuration.

Under normal conditions, both the RPL owner port and the RPL neighbor port are blocked to prevent network loops.

When a link failure occurs, both the RPL owner port and the RPL neighbor port are unblocked.

  • Ordinary Port

An ERPS port is an ordinary port if it is not specified as an RPL owner port or an RPL neighbor port.

The ordinary ports are responsible for monitoring the link status of the ERPS ring and informing other ERPS nodes.

Control VLAN

In an ERPS ring, the control VLAN is used to transmit ERPS protocol packets.

  • Each ERPS ring instance must be configured with a control VLAN.
  • Different ERPS ring instances cannot use the same control VLAN.
  • The same control VLAN must be configured for all devices in the same ERPS ring instance.
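The three rules above can be expressed as simple checks. The following Python sketch is an illustration only: the data layout (a mapping from ring and instance IDs to control VLANs, and a mapping from device names to control VLANs) is a hypothetical model, not a PICOS data structure.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical key: (ring ID, instance ID) on the local device.
InstanceKey = Tuple[int, int]


def check_control_vlans(local: Dict[InstanceKey, Optional[int]]) -> List[str]:
    """Check rules 1 and 2: every instance has a control VLAN, and no two share one."""
    problems = []
    users = defaultdict(list)
    for key, vlan in sorted(local.items()):
        if vlan is None:
            problems.append(f"ring {key[0]} instance {key[1]}: no control VLAN configured")
        else:
            users[vlan].append(key)
    for vlan, keys in sorted(users.items()):
        if len(keys) > 1:
            problems.append(f"control VLAN {vlan} is used by more than one instance: {keys}")
    return problems


def same_control_vlan_everywhere(per_device: Dict[str, int]) -> bool:
    """Check rule 3: all devices in one ring instance use the same control VLAN."""
    return len(set(per_device.values())) <= 1


print(check_control_vlans({(1, 1): 100, (1, 2): 100}))
print(same_control_vlan_everywhere({"A": 100, "B": 100, "C": 101}))
```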

Data VLAN

Data VLANs are the VLANs used in the ring to transmit user traffic. They are defined through the MSTP instance and VLAN mapping.

Protected Instance

The protected instance is the MSTP instance, to which the data VLANs are mapped, that requires ERPS ring protection. Before configuring ERPS, configure the protected MSTP instance and its VLAN mapping with the command set protocols spanning-tree mstp msti <msti> vlan <vlan-id>. Then configure the MSTP instance as the protected instance of the ERPS ring instance with the command set protocols erps ring <ring-id> instance <instance-id> protected-instance <msti>.

The control VLAN and the data VLANs must both be configured in the protected instance; only then can the ERPS protocol process messages in these VLANs.
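To make the order of these steps concrete, the following Python sketch fills in the two command templates quoted above with sample values. The ring ID, instance ID, MSTI, and VLAN numbers are invented for illustration, and the sketch issues one vlan command per VLAN because the text does not say whether the command accepts a list or range.

```python
def protected_instance_commands(ring_id, instance_id, msti, vlan_ids):
    """Render the command templates from the text for the given values."""
    # First map every VLAN (control and data) to the MSTP instance.
    commands = [
        f"set protocols spanning-tree mstp msti {msti} vlan {vlan}"
        for vlan in vlan_ids
    ]
    # Then bind the MSTP instance to the ERPS ring instance.
    commands.append(
        f"set protocols erps ring {ring_id} instance {instance_id} "
        f"protected-instance {msti}"
    )
    return commands


# Sample values only: control VLAN 100 plus data VLANs 200 and 201 in MSTI 1.
for line in protected_instance_commands(
    ring_id=1, instance_id=1, msti=1, vlan_ids=[100, 200, 201]
):
    print(line)
```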

ERPS Timer

There are three timers used in ERPS protocol: Guard Timer, WTR (Wait to Restore) Timer and Holdoff Timer.

  • Guard Timer

After the failure is recovered, or after it detects that the SF condition has been cleared, the device involved in the signal failure (SF) sends R-APS (NR) messages to the other nodes and starts the guard timer at the same time. R-APS (NR) messages are not processed until this timer expires, which prevents the device from acting on outdated R-APS (NR) messages. If R-APS (NR) messages from other ports are still received after the guard timer expires, the port changes to the forwarding state.

The guard timer can be configured. The default time interval is 500 milliseconds; the time interval ranges from 10 to 2000 milliseconds.
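The effect of the guard timer can be sketched as a simple filter on received messages. The Python class below is a minimal model under stated assumptions: time is passed in explicitly as milliseconds, the class and method names are hypothetical, and it only decides whether a received R-APS (NR) message should be processed.

```python
class GuardTimer:
    """Minimal model of a guard timer; times are plain millisecond values."""

    def __init__(self, interval_ms=500):  # 500 ms is the default stated above
        self.interval_ms = interval_ms
        self.expires_at = None  # None means the timer has never been started

    def start(self, now_ms):
        """Start (or restart) the timer when the local failure clears."""
        self.expires_at = now_ms + self.interval_ms

    def is_running(self, now_ms):
        return self.expires_at is not None and now_ms < self.expires_at

    def should_process(self, now_ms):
        """Received R-APS (NR) messages are ignored while the timer runs."""
        return not self.is_running(now_ms)


timer = GuardTimer()
timer.start(now_ms=0)
print(timer.should_process(now_ms=100))  # False: still inside the guard interval
print(timer.should_process(now_ms=500))  # True: the timer has expired
```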

  • Wait-to-Restore (WTR) Timer

When recovering from a signal failure (SF) condition, the WTR timer is used to prevent frequent operation of protection switching due to intermittent SF defects.

The WTR timer can be configured. The delay timer must be long enough to allow the recovering network to become stable. The default time interval is 5 minutes; the time interval ranges from 1 to 12 minutes.

  • Holdoff Timer

If the holdoff timer is specified, a defect is not reported to the ring protection mechanism immediately. Instead, the holdoff timer is started. When the timer expires, the defect is reported to protection switching only if it still exists.

The holdoff timer can be configured. The default time interval is 0 milliseconds; the time interval ranges from 0 to 1000 milliseconds.
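The defaults and ranges of the three timers can be collected in one table and checked before a value is applied. The following Python sketch is illustrative only: the function and dictionary names are hypothetical, and it does not model any step size or granularity that the device may enforce.

```python
# name: (minimum, maximum, default, unit), taken from the text above
TIMER_LIMITS = {
    "guard": (10, 2000, 500, "milliseconds"),
    "wtr": (1, 12, 5, "minutes"),
    "holdoff": (0, 1000, 0, "milliseconds"),
}


def resolve_timer(name, value=None):
    """Return the value to use for a timer, or raise if it is out of range."""
    low, high, default, unit = TIMER_LIMITS[name]
    if value is None:
        return default  # nothing configured, so the default applies
    if not low <= value <= high:
        raise ValueError(
            f"{name} timer must be between {low} and {high} {unit}, got {value}"
        )
    return value


print(resolve_timer("guard"))        # default value
print(resolve_timer("wtr", 10))      # accepted, within range
```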

ERPS Operation Mechanism

Under normal conditions, devices on the ring communicate normally and all ports forward traffic, except the RPL owner port and the RPL neighbor port, which ERPS blocks to prevent network loops. ERPS operation around a link failure involves two processes, Link Failure and Recovery of Link Failure, which are described below.

Link Failure

As shown in Figure 2, when the link between Switch B and Switch C fails, the ERPS protocol initiates protection switching: it blocks the ports at both ends of the failed link and then unblocks the RPL owner port, which resumes sending and receiving user traffic so that traffic continues to flow. The detailed process is as follows:

  1. Switch B and Switch C detect a link failure, block the port on the failed link, and perform the FDB flush.
  2. Switch B and Switch C then send R-APS (SF) messages periodically on both ring ports, carrying the (node ID, BPR) pair (BPR: Blocked Port Reference), for as long as the SF condition persists.
  3. When the other devices receive the R-APS (SF) messages from Switch B and Switch C, they all perform an FDB flush. When Switch A (the device where the RPL owner port is located) receives the R-APS (SF) message, it unblocks the RPL owner port; likewise, Switch E unblocks the RPL neighbor port. Both then perform an FDB flush.
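The steps above can be sketched as a small simulation. The Python model below makes several assumptions that are not stated in the text: the ring order A-B-C-D-E, the RPL being the link between Switch A and Switch E, and blocking tracked per link rather than per port. Message exchange is reduced to log entries.

```python
RING = ["A", "B", "C", "D", "E"]   # assumed ring order
RPL = frozenset({"A", "E"})        # assumed: owner port on A, neighbor port on E


def link(a, b):
    return frozenset({a, b})


class Ring:
    def __init__(self):
        count = len(RING)
        self.links = {link(RING[i], RING[(i + 1) % count]) for i in range(count)}
        self.blocked = {RPL}       # normal state: only the RPL is blocked
        self.flushed = set()       # nodes that have flushed their FDB
        self.log = []

    def fail(self, a, b):
        failed = link(a, b)
        # Step 1: the adjacent nodes block the failed port and flush their FDB.
        self.blocked.add(failed)
        self.flushed.update({a, b})
        # Step 2: they send R-APS (SF) on both ring ports.
        self.log.append(f"{a} and {b} send R-APS (SF)")
        # Step 3: all other nodes flush; owner and neighbor unblock the RPL.
        self.flushed.update(RING)
        if failed != RPL:
            self.blocked.discard(RPL)
            self.log.append("RPL owner and neighbor ports unblocked")

    def reachable(self, start):
        """Return the nodes reachable from start over unblocked links."""
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for candidate in self.links - self.blocked:
                if node in candidate:
                    (other,) = candidate - {node}
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        return seen


ring = Ring()
ring.fail("B", "C")
print(sorted(ring.reachable("A")))  # all five nodes remain reachable
print(ring.log)
```

In this model, four of the five links are usable both before and after the failure, so the five nodes stay connected without a loop.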

Figure 2. ERPS Link Failure

Recovery of Link Failure

After the link has recovered from the failure, it can carry user traffic again, and the RPL owner port and RPL neighbor port are blocked again. The detailed recovery process is as follows:

  1. When the link between Switch B and Switch C recovers, Switch B and Switch C each start a guard timer to prevent the reception of outdated R-APS (NR) messages, and they do not accept other R-APS protocol messages until the timer expires. At the same time, Switch B and Switch C send R-APS (NR) messages to the other nodes.
  2. When Switch A (the device where the RPL owner port is located) receives the R-APS (NR) message, it starts the WTR timer. When the WTR timer expires, Switch A blocks its end of the RPL, sends an R-APS (NR, RB) message with the (node ID, BPR) pair, and performs an FDB flush.
  3. When Switch B and Switch C receive the R-APS (NR, RB) message from Switch A, they unblock their blocked ring ports, stop sending R-APS (NR) messages, and perform an FDB flush. In addition, Ethernet ring nodes B to E perform an FDB flush on receiving the R-APS (NR, RB) message, as a result of the node ID and BPR-based mechanism.
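The role of the RPL owner in this recovery can be sketched as a small state machine. The Python class below is a simplified illustration with hypothetical names; time is passed in explicitly as seconds. One behavior is an assumption not described in the text: a new R-APS (SF) message received while the WTR timer is running cancels the pending restore.

```python
class RplOwner:
    """Simplified model of the RPL owner node; times are plain seconds."""

    def __init__(self, wtr_minutes=5):  # 5 minutes is the default WTR interval
        self.wtr_seconds = wtr_minutes * 60
        self.rpl_blocked = True         # normal state: the RPL is blocked
        self.wtr_deadline = None        # None means the WTR timer is not running
        self.sent = []                  # messages sent by this node

    def on_raps_sf(self):
        """A failure was reported: unblock the RPL owner port."""
        self.rpl_blocked = False
        self.wtr_deadline = None        # assumption: cancel any pending restore

    def on_raps_nr(self, now):
        """The failure cleared: start the WTR timer if it is not running yet."""
        if not self.rpl_blocked and self.wtr_deadline is None:
            self.wtr_deadline = now + self.wtr_seconds

    def tick(self, now):
        """On WTR expiry, block the RPL again and announce it with (NR, RB)."""
        if self.wtr_deadline is not None and now >= self.wtr_deadline:
            self.wtr_deadline = None
            self.rpl_blocked = True
            self.sent.append("R-APS (NR, RB)")


owner = RplOwner()
owner.on_raps_sf()          # link failure: RPL unblocked
owner.on_raps_nr(now=0)     # link recovered: WTR timer starts
owner.tick(now=299)         # WTR still running, RPL stays unblocked
owner.tick(now=300)         # WTR expired, RPL blocked again
print(owner.rpl_blocked, owner.sent)
```

Injecting the current time instead of reading a clock keeps the sketch deterministic, which makes the five-minute wait easy to exercise in a test.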



Copyright © 2024 Pica8 Inc. All Rights Reserved.