Principle of MLAG
Definition
MLAG (Multi-chassis Link Aggregation), as the name suggests, allows the member ports of a LAG interface to be deployed on two different devices that appear as a single device. The two peer devices stay in communication by exchanging hello packets, and they synchronize the MAC addresses learned on the LAG interface using L2 multicast packets with the destination address 01:80:c2:00:00:0f. A downstream switch or host at the other end of the LAG link is not aware that two devices are connected to it on the other side of the link. The figure below shows a basic MLAG networking example.
Figure 1. MLAG Networking
MLAG is mainly applied in scenarios where a downstream switch or host has or needs dual-access to the network. In Figure 1, without deploying MLAG, SwitchB can only connect to SwitchA1 using an LACP link. If the LACP link or SwitchA1 fails, SwitchB cannot communicate with the network. By using MLAG, the downstream switch or host can have dual-access to the network, enabling link and device-level redundancy and protection.
This provides redundancy by giving the downstream switch or host two uplink paths as well as full bandwidth utilization since the MLAG domain appears to be a single switch to Spanning Tree Protocol (STP). Because the MLAG domain appears to STP as a single switch there are no blocked ports.
MLAG has the following advantages:
• Increased bandwidth
MLAG aggregates multiple Ethernet ports across two switches, which increases the uplink bandwidth. The maximum bandwidth of the link aggregation interface can reach the sum of the bandwidths of all MLAG member ports.
• Higher reliability
The two devices work together to ensure high reliability. When a link or device fails, traffic is switched to the other available member links or device, which improves the reliability of the MLAG domain.
• Load balancing
In an MLAG domain, you can achieve load balancing on each active aggregation interface link.
Basic Concepts
• MLAG domain
MLAGs on the MLAG peer devices are distinguished by MLAG domain, and each MLAG belongs to one MLAG domain. The MLAG domain maintains the configuration information and status of the local and peer MLAG devices.
The MLAG domain ID is the unique identifier for an MLAG domain, and different MLAGs require different MLAG domain IDs. The MLAG domain ID must be identical on each switch to facilitate MLAG communication.
Figure 2 shows multiple MLAGs networking, where Switch1, Switch2 and the member ports connected to Switch3 form an MLAG, MLAG Domain 1; Switch1, Switch2 and the member ports connected to Switch4 form another MLAG, MLAG Domain 2.
Figure 2. Multiple MLAGs Networking
• MLAG peer
MLAG peers are the pair of switches on which the MLAG function is enabled.
• MLAG member port
MLAG member ports are the ports on the MLAG peer devices that make up the MLAG, that is, the ports that connect to the downstream device.
• MLAG peer link
MLAG peer link is the link between MLAG peer devices, used for transmitting MLAG peer link information, such as MLAG local information and peer information, MAC synchronization messages, LACPDU and STP packets of MLAG.
Negotiation of Master and Slave
MLAG determines the Master and Slave based on the system priority (configurable) and the system MAC. These two parameters are transmitted in hello packets between the MLAG peer devices.
The system priority is compared first. A smaller system priority value indicates a higher priority; the device with the higher priority becomes the Master, and its MLAG peer becomes the Slave. If the system priorities are the same, the system MACs are compared: the device with the smaller system MAC becomes the Master, and the MLAG peer device becomes the Slave.
NOTE: Master and Slave are defined per MLAG domain. You can configure several MLAGs on the same pair of MLAG peer devices, so a device may be the Master for one MLAG and the Slave for another.
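The comparison described above can be sketched in Python as follows. This is only an illustration, not the PicOS implementation: the PeerInfo class, its field names, and the priority values are hypothetical, and the MAC addresses are borrowed from sample outputs in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerInfo:
    """Election parameters carried in hello packets (illustrative field names)."""
    name: str
    system_priority: int
    system_mac: str  # colon-separated, e.g. "08:9e:01:53:78:dc"


def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer for comparison."""
    return int(mac.replace(":", ""), 16)


def elect_master(a: PeerInfo, b: PeerInfo) -> PeerInfo:
    """Return the Master: the smaller priority value wins, then the smaller system MAC."""
    key_a = (a.system_priority, mac_to_int(a.system_mac))
    key_b = (b.system_priority, mac_to_int(b.system_mac))
    return a if key_a <= key_b else b


# Hypothetical peers with equal priority, so the system MAC decides.
sw1 = PeerInfo("Switch1", system_priority=100, system_mac="08:9e:01:53:78:dc")
sw2 = PeerInfo("Switch2", system_priority=100, system_mac="08:9e:01:61:66:7f")
print(elect_master(sw1, sw2).name)
```

Because the election runs per MLAG domain, the same function would be evaluated once for each domain with that domain's parameters.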
Hello Packets
Hello packets are sent periodically through the layer-3 network, for discovering and maintaining neighbor relationships for MLAG. The main parameters in a hello packet are: domain ID, system MAC, system priority, MLAG interface state, MLAG peer specified IP, MLAG peer system priority, MLAG peer system MAC, and MLAG peer LAG ID.
The main functions of hello packets are:
• Sending heartbeat messages between the two MLAG peer-link devices periodically (every 4s by default). If one of the MLAG peer devices or the peer link fails, the device on the other side of the peer link detects the failure because it receives no heartbeat message before the timeout expires.
• Transmitting system priority and system MAC to the MLAG peer for Master Slave negotiation.
• Transmitting interface state to the MLAG peer for running MLAG state machine.
• Transmitting MLAG peer system MAC, MLAG peer specified IP, and MLAG peer LAG ID to the MLAG peer for checking the information of peer device.
• Transmitting MLAG system ID to the MLAG peer for setting LACP MAC when needed.
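The heartbeat role of hello packets can be sketched as a small liveness tracker. This is an assumption-laden illustration: the PeerLiveness class is hypothetical, the 4 s interval is the default stated above, and the timeout of 3 times the hello-interval is taken from the Reload Delay section of this document.

```python
HELLO_INTERVAL_S = 4.0  # default hello interval stated in this document
TIMEOUT_MULTIPLIER = 3  # from the Reload Delay section: 3 x hello-interval


class PeerLiveness:
    """Tracks when the last hello arrived and whether the peer has timed out."""

    def __init__(self, hello_interval_s=HELLO_INTERVAL_S, multiplier=TIMEOUT_MULTIPLIER):
        self.timeout_s = hello_interval_s * multiplier
        self.last_hello_s = None  # no hello received yet

    def on_hello(self, now_s):
        """Record the arrival time of a hello packet."""
        self.last_hello_s = now_s

    def is_alive(self, now_s):
        """The peer is alive if a hello arrived less than timeout_s seconds ago."""
        if self.last_hello_s is None:
            return False
        return (now_s - self.last_hello_s) < self.timeout_s


peer = PeerLiveness()
peer.on_hello(0.0)
print(peer.is_alive(8.0), peer.is_alive(12.0))
```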
MLAG State
The system defines MLAG interface state and MLAG neighbor state to facilitate link fault detection and recovery. You can view the MLAG interface state and MLAG neighbor state by using the relevant show commands.
MLAG Interface State
MLAG interface state defines the status of peer session and MLAG member ports, including the following types:
• ASY_PEER: Peer session is established, the local MLAG member port is down and peer MLAG member port is up.
• ASY_LOCAL: Peer session is established, the local MLAG member port is up and peer MLAG member port is down.
• STANDBY: Peer session is not established, and the local MLAG member port is up.
• DOWN: There are two cases:
- Peer session is established and the MLAG member ports on both MLAG peer devices are down.
- Peer session is not established and the local MLAG member port is down.
• FULL: Peer session is established and the MLAG member ports on both MLAG peer devices are up.
You can use the run show mlag internal command to view the status of MLAG interface. For example,
root@Xorplus# run show mlag internal
Domain-id  Local-LAG  Flood  MAC-sync  State     Config-Match  Role
-----------------------------------------------------------------------------------------------------
2          ae3        false  true      FULL      Yes           SLAVE
1          ae1        false  false     ASY_PEER  Yes           MASTER
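The five interface states listed above follow directly from three inputs: whether the peer session is established, and whether the local and peer MLAG member ports are up. A minimal Python sketch of that mapping (the function name and parameters are illustrative, not a PicOS API):

```python
def mlag_interface_state(session_established: bool, local_up: bool, peer_up: bool) -> str:
    """Map peer-session and member-port status to the MLAG interface state."""
    if not session_established:
        # Without a peer session, only the local member port status matters.
        return "STANDBY" if local_up else "DOWN"
    if local_up and peer_up:
        return "FULL"
    if local_up:
        return "ASY_LOCAL"
    if peer_up:
        return "ASY_PEER"
    return "DOWN"


print(mlag_interface_state(True, False, True))
```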
MLAG Neighbor State
MLAG neighbor state defines the status of MLAG peer device and peer session, including the following types:
• ONE-WAY: The switch has received a hello packet from the MLAG peer device, but the hello packet does not contain the local device information, indicating that the peer session is not established.
• TWO-WAY: Both sides of the MLAG peer device know each other, peer session is established, but the peer MLAG member port is down.
• FULL: Both sides of the MLAG peer device know each other, and MLAG member ports on both sides of MLAG peer device are up.
You can use the run show mlag peer domain-id command to view the status of MLAG peer device of an MLAG domain. For example,
root@Xorplus# run show mlag peer 1
Domain-id  Peer      System-mac         State    Link-status
-----------------------------------------------------------------------------------
1          10.1.1.1  08:9e:01:53:78:dc  TWO-WAY  DOWN
Reload Delay
Suppose the local MLAG peer device is the master. When the peer-link goes down because the local device or the local MLAG member port fails, the slave device becomes the master. If no system ID is configured, the slave then uses its own device MAC (instead of the original master's device MAC) for LACP negotiation with the downstream device. This change of MAC address interrupts communication with the downstream device and affects normal service delivery.
To reduce the impact of this kind of network flapping, PicOS supports a reload delay function. After the local device fails, if the same MLAG system ID is configured on both sides of the MLAG peer-link, the access switch cannot sense that the peer-link is down, so traffic continues to use the links to both the Master and the Slave. If the MLAG system ID is not configured, however, the peer device considers the peer-link down when it receives no hello packets for a period of 3 times the hello-interval. The reload-delay timer is then started if it is not set to zero. Until the reload-delay timer expires, the MAC address used by the slave device does not change, so the link between the downstream device and the slave keeps working and normal service delivery is maintained.
If the local device or local MLAG member port recovers before the reload-delay timer expires, the local device changes back to master after the reload-delay time. If the local device or local MLAG member port is still down when the reload-delay timer expires, the slave device uses its own device MAC for LACP negotiation with the downstream device; it then takes about 1.5 minutes before communication with the downstream device is established.
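The timeline described in this section can be sketched as a small calculation. It is a simplified model, not the PicOS implementation: the function name and parameters are hypothetical, and the 240 s reload delay in the example is an arbitrary illustrative value.

```python
from typing import Optional

TIMEOUT_MULTIPLIER = 3  # peer-link is declared down after 3 x hello-interval


def own_mac_switch_time(
    last_hello_s: float,
    hello_interval_s: float,
    reload_delay_s: float,
    system_id_configured: bool,
    master_recovers_at_s: Optional[float] = None,
) -> Optional[float]:
    """Return when the slave starts using its own MAC for LACP, or None if it never does."""
    if system_id_configured:
        # The shared system ID is used for LACP, so the MAC does not change.
        return None
    peer_link_down_at = last_hello_s + TIMEOUT_MULTIPLIER * hello_interval_s
    timer_expires_at = peer_link_down_at + reload_delay_s
    if master_recovers_at_s is not None and master_recovers_at_s < timer_expires_at:
        # The master recovered before the timer expired, so no MAC change occurs.
        return None
    return timer_expires_at


# Hypothetical values: 4 s hello interval (the default) and a 240 s reload delay.
print(own_mac_switch_time(0.0, 4.0, 240.0, system_id_configured=False))
```

With a reload delay of zero, the model switches the MAC as soon as the peer-link is declared down, which matches the statement that the timer is only started when it is not zero.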
MAC Synchronization between MLAG Member Ports
To ensure that traffic of the same user is forwarded correctly by both MLAG peer devices, the MAC addresses learned on the MLAG member ports of the two peer devices need to be consistent. MAC synchronization messages are carried in L2 multicast packets with the destination address 01:80:c2:00:00:0f.
In addition, to limit the peer-link bandwidth consumed by flooding unknown unicast traffic from the downstream switches, the MLAG peer switches should synchronize MAC addresses with each other.
If the following two conditions are satisfied, the MLAG peer device will check the local MAC address table for MAC synchronization.
- The MLAG interface state on both peer-link devices is FULL.
- The MLAG peer-link interface on both peer-link devices is up.
If the MAC address table is updated, the MAC synchronization message will be sent. The message includes MAC addresses and VLAN IDs of the MLAG interface in the MAC address table. After receiving the MAC synchronization message, the peer device updates its MAC address table accordingly to synchronize with the peer device.
If the MAC synchronization condition is no longer satisfied, or a MAC entry delete message is received from the peer node, the MLAG device deletes the Peer-Sync MACs from its local MAC table.
There are three types of MAC addresses defined in MLAG: Static, Dynamic, and Peer-Sync. Peer-Sync represents a dynamic MAC address synchronized from the MLAG peer device, and its priority is lower than that of a static MAC. If one of the MLAG peer switches fails, the Peer-Sync MAC addresses on the other switch are deleted from the MAC address table.
To handle MAC synchronization failures caused by the loss of MAC synchronization packets, the system provides the MAC FLOOD mechanism, which periodically checks and synchronizes the MAC addresses on the MLAG peers. When the MAC synchronization condition is satisfied, the MLAG device periodically floods its local MAC addresses to the MLAG peer node. The peer node then adds or deletes the mismatched MACs, which ensures that the MAC addresses on both MLAG peers are consistent.
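The synchronization condition and the handling of Peer-Sync entries can be sketched as follows. This is a simplified model under stated assumptions: the table layout and function names are hypothetical, and the sketch leaves existing local Static and Dynamic entries untouched, although the text only states that Peer-Sync has lower priority than Static.

```python
def sync_condition_met(local_state, peer_state, local_peer_link_up, peer_peer_link_up):
    """Both MLAG interface states are FULL and both peer-link interfaces are up."""
    return (local_state == "FULL" and peer_state == "FULL"
            and local_peer_link_up and peer_peer_link_up)


def apply_sync_message(table, entries):
    """Install (vlan, mac, interface) entries received from the peer as Peer-Sync."""
    for vlan, mac, interface in entries:
        existing = table.get((vlan, mac))
        # Assumption: entries learned or configured locally are never overwritten.
        if existing is None or existing[0] == "Peer-Sync":
            table[(vlan, mac)] = ("Peer-Sync", interface)


def purge_peer_sync(table):
    """Delete every Peer-Sync entry, e.g. when the sync condition no longer holds."""
    for key in [k for k, v in table.items() if v[0] == "Peer-Sync"]:
        del table[key]


# MAC table: (vlan, mac) -> (entry type, interface)
table = {(2, "00:e0:fc:01:01:01"): ("Dynamic", "ae1")}
if sync_condition_met("FULL", "FULL", True, True):
    apply_sync_message(table, [(2, "00:e0:fc:01:01:02", "ae1"),
                               (2, "00:e0:fc:01:01:01", "ae1")])
print(sorted(table.items()))
purge_peer_sync(table)
print(sorted(table.items()))
```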
You can use the run show mac-address table command to view the information about MAC address table, such as MAC address statistics, VLAN ID, MAC address, MAC address type and outbound interface. For example,
root@XorPlus# run show mac-address table
Total entries in switching table: 10
Static entries in switching table: 0
Dynamic entries in switching table: 10
VLAN MAC address       Type      Age     Interfaces       User
---- ----------------- --------- ----    ---------------- ----------
1    08:9e:01:61:66:7f Dynamic   1000000 ge-1/1/44        xorp
2    00:e0:fc:01:01:01 Dynamic   1000000 ae1              xorp
2    00:e0:fc:01:01:02 Peer-Sync 1000000 ae1              xorp
2    48:6e:73:02:00:d4 Dynamic   1000000 ae48             xorp
2    60:eb:69:d2:9c:d7 Dynamic   1000000 ae1              xorp
3    60:eb:69:d2:9c:d7 Dynamic   1000000 ae2              xorp
4094 48:6e:73:02:00:d4 Dynamic   1000000 ae48             xorp
MAC synchronization between MLAG member ports synchronizes only the MAC addresses learned on interfaces that belong to an MLAG domain.
Figure 3. MLAG Networking for MAC Synchronization
As shown in Figure 3, Switch1, Switch2 and the member ports connected to Switch3 form an MLAG, MLAG Domain 1; Switch1, Switch2 and the member ports connected to Switch4 form another MLAG, MLAG Domain 2. The Te-1/1/1 interface on Switch1, which connects to Switch5, does not belong to any MLAG domain, so the dynamic MAC address learned on Te-1/1/1 is not synchronized to the MLAG member port ae1 on Switch2 during MLAG MAC synchronization. However, because Te-1/1/1 is a single-homed port, the address is synchronized to the MLAG peer device on the MLAG peer-link port, where it appears with the type Peer-Sync in the MAC address table. For details about single-homed ports, see Single-homed Port.
admin@Switch1# run show mac-address table
Total entries in switching table: 4
Static entries in switching table: 0
Dynamic entries in switching table: 4
VLAN MAC address       Type      Age  Interfaces       User
---- ----------------- --------- ---- ---------------- ------
1    08:9e:01:61:64:13 Dynamic   300  te-1/1/1         xorp
1    cc:37:ab:4f:ad:01 Peer-Sync 300  ae1              xorp
1    2c:35:58:2f:88:ab Dynamic   300  ae2              xorp
4094 8c:ea:1b:88:5b:81 Dynamic   300  ae3              xorp
admin@Switch2# run show mac-address table
Total entries in switching table: 3
Static entries in switching table: 0
Dynamic entries in switching table: 3
VLAN MAC address       Type      Age  Interfaces  User
---- ----------------- --------- ---- ----------- ----
1    08:9e:01:61:64:13 Peer-Sync 300  ae3         xorp
1    cc:37:ab:4f:ad:01 Dynamic   300  ae1         xorp
1    2c:35:58:2f:88:ab Peer-Sync 300  ae2         xorp
4094 cc:37:ab:56:6e:81 Dynamic   300  ae3         xorp
Single-homed Port
A single-homed port is a port on an MLAG peer device that gives an access device single access to the network through either the MLAG master or the MLAG slave device. Single-homed ports on the MLAG peer devices can connect to hosts or servers as well as to other access switches. As shown in Figure 4, Switch 1 and Switch 3 are single-homed devices, and the ports on the MLAG peer devices that connect to Switch 1 and Switch 3 are called single-homed ports. Traffic between Switch 1 and Switch 3 always crosses the MLAG peer-link because the two switches attach to different MLAG peer devices. With single-homed ports, hosts and other standalone switches are able to single-home into the network.
Figure 4. MLAG network
The MAC address entries learned on a single-homed port are synchronized to the MLAG peer device on the MLAG peer-link port, and the address type is Peer-Sync in the MAC address table. However, MAC synchronization for single-homed ports is performed only when at least one MLAG has an MLAG neighbor state of TWO-WAY or FULL. This MAC synchronization ensures that the devices connected to the single-homed port can communicate normally.
Physical ports and LAG ports can be single-homed ports, and MLAG member ports that meet certain conditions can also be single-homed ports, but peer-link ports cannot.
An MLAG member port becomes a single-homed port when one LAG port of the dual-homed access device goes down: the other LAG port then acts as a single-homed port. In other words, when the MLAG interface state is ASY_LOCAL, the MLAG member port on the local MLAG device is a single-homed port. MAC address entries learned on this port are synchronized to the MLAG peer device on the MLAG peer-link port.
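The rules for single-homed ports can be summarized in a short Python sketch. The port_kind labels and function names are hypothetical, and the sketch assumes that "physical" and "lag" refer to access ports that are neither MLAG member ports nor peer-link ports.

```python
def is_single_homed(port_kind: str, interface_state: str = "") -> bool:
    """Decide whether a port counts as single-homed.

    port_kind is one of "physical", "lag", "mlag_member", "peer_link"
    (illustrative labels, not PicOS terminology).
    """
    if port_kind == "peer_link":
        return False  # peer-link ports cannot be single-homed ports
    if port_kind == "mlag_member":
        # An MLAG member port is single-homed only while the peer side is down.
        return interface_state == "ASY_LOCAL"
    return port_kind in ("physical", "lag")


def single_homed_sync_enabled(neighbor_states) -> bool:
    """Sync runs only if at least one MLAG neighbor state is TWO-WAY or FULL."""
    return any(state in ("TWO-WAY", "FULL") for state in neighbor_states)


print(is_single_homed("mlag_member", "ASY_LOCAL"), single_homed_sync_enabled(["ONE-WAY", "FULL"]))
```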
Application Scenarios
As shown in Figure 5, PC 2 connects to the MLAG downlink switch (Switch 2), and communicates with PC 1 through the MLAG peer devices.
Figure 5. Network 1 of PC 1 and PC 2 Communication in MLAG Topology
Normally, the traffic from PC 1 to PC 2 will go out of Port 1 to Switch 2. Any packet received from peer-link on MLAG slave device will be blocked to all MLAG member ports.
Traffic sent from PC 2 to PC1 will be hashed to one of the MLAG peer devices. If the traffic is hashed to the MLAG slave, the traffic is therefore forwarded by the slave device across the peer-link to MLAG master. This is because the MAC address learned on Port 3 will be synchronized to the peer-link port Port 5 on MLAG slave device.
When the topology changes, as shown in Figure 6, the location of PC 2 is changed and is now accessing the network through the MLAG slave device on port 3. The MAC address of PC 2 will be learned on Port 3 of MLAG Slave device. At this time, since Port 3 is a single-homed port, the MAC address entry learned on Port 3 will be synchronized to the peer-link port Port 4 on MLAG master device. The traffic sent from PC 1 to PC 2 will go out of Port 4 instead of Port 1 on MLAG master device and sent to the slave device via peer-link.
Similarly, as Port 3 is a single-homed port, MAC addresses of single-homed hosts connected to the MLAG slave device will automatically be learned by MLAG master device. The traffic flow path from PC 2 to PC 1 is similar. This ensures that the devices connected to the single-homed port can communicate normally.
Figure 6. Network 2 of PC 1 and PC 2 Communication in MLAG Topology
For IP routing, as shown in Figure 7, PC1 and PC2 belong to different subnets. In this scenario, you can apply VRRP in the MLAG topology so that PC1 and PC2 can communicate with each other through IP routing. Configure two VRRP groups on the two VRRP devices, each belonging to a different L3 VLAN interface. Configure a different virtual IP address for each VRRP group: virtual IP address 10.10.10.1 is used as the gateway for PC1, and virtual IP address 20.20.20.1 is used as the gateway for PC2.
Figure 7. Network 3 of PC 1 and PC 2 Communication in MLAG with VRRP Topology
Configuration Consistency
To ensure that the MLAG peer link devices appear as one device to the downstream device, and to make the MLAG function operate smoothly, the configuration of MLAG peer-link interfaces on each MLAG peer device needs to be consistent.
PicOS automatically checks the configuration consistency of the MLAG peer-link devices. Items that are checked include Native vlan-id, Untagged vlan-id, Tagged vlan-id, Mac learning, and System ID. If inconsistencies are found, the system generates a log message.
The System ID configuration is consistent if it is configured with the same value on both MLAG peers, or if it is not configured on either peer. It is inconsistent if it is configured on only one of the MLAG peers.
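The consistency rules above can be sketched as a comparison of two configuration records. The dictionary keys and function names are hypothetical, and treating two different System ID values as inconsistent is an assumption, since the text only covers the "same value", "one side only", and "neither side" cases.

```python
from typing import Dict, List, Optional

CHECKED_ITEMS = ("native_vlan_id", "untagged_vlan_ids", "tagged_vlan_ids", "mac_learning")


def system_id_consistent(local: Optional[str], peer: Optional[str]) -> bool:
    """Consistent when unset on both sides or set to the same value on both sides."""
    if local is None and peer is None:
        return True
    if local is None or peer is None:
        return False
    return local.lower() == peer.lower()  # assumption: differing values are inconsistent


def config_mismatches(local: Dict, peer: Dict) -> List[str]:
    """Return the names of checked items that differ between the two peers."""
    mismatches = [item for item in CHECKED_ITEMS if local.get(item) != peer.get(item)]
    if not system_id_consistent(local.get("system_id"), peer.get("system_id")):
        mismatches.append("system_id")
    return mismatches


local_cfg = {"native_vlan_id": 100, "mac_learning": True, "system_id": "08:9e:01:53:78:dc"}
peer_cfg = {"native_vlan_id": 100, "mac_learning": True}
print(config_mismatches(local_cfg, peer_cfg))
```

An empty result corresponds to Config-Match Yes; any listed item corresponds to Config-Match No.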
You can use the run show mlag internal command to view the results of the consistency check. For example:
root@XorPlus# run show mlag internal
Domain-id  Local-LAG  Flood  MAC-sync  State     Config-Match  Role
----------------------------------------------------------------------------------------------------------
1          ae1        false  false     DOWN      No            MASTER
Config-Match can be Yes or No:
- If Config-Match is Yes, the configurations of the MLAG peer-link devices are consistent.
- If Config-Match is No, the configurations of the MLAG peer-link devices are inconsistent.
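When many domains are configured, the Config-Match column can be checked with a small script. The sketch below assumes the output is whitespace-separated with no spaces inside a field, as in the examples in this document; the function names are hypothetical, and the sample text combines rows from two of those examples.

```python
from typing import Dict, List

SAMPLE_OUTPUT = """\
Domain-id Local-LAG Flood MAC-sync State Config-Match Role
----------------------------------------------------------
2 ae3 false true FULL Yes SLAVE
1 ae1 false false DOWN No MASTER
"""


def parse_mlag_internal(output: str) -> List[Dict[str, str]]:
    """Turn the table printed by 'run show mlag internal' into a list of row dicts."""
    lines = [ln for ln in output.splitlines()
             if ln.strip() and not ln.strip().startswith("-")]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def mismatched_domains(output: str) -> List[str]:
    """Return the domain IDs whose Config-Match column is No."""
    return [row["Domain-id"] for row in parse_mlag_internal(output)
            if row["Config-Match"] == "No"]


print(mismatched_domains(SAMPLE_OUTPUT))
```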
You can run the run show mlag internal domain-id detail command on each peer-link device to view the details of the checked items. For example:
root@XorPlus# run show mlag internal 1 detail
Local-LAG: ae1
Native VlanID: 100
Mac Learning: true
Untagged VlanID:
Tagged VlanID:
System ID:
Flood Control
To prevent the downstream switches from receiving duplicate copies of a packet from both MLAG peers, a block mask prevents all traffic received on the MLAG peer link from being forwarded to the MLAG member ports, as shown in Figure 8.
The forwarding block mask for a given MLAG link is cleared if all the MLAG member ports on the MLAG peer go down.
Figure 8. MLAG Flood Control
1. All packets (unicast, multicast or broadcast) received from SwitchB on SwitchA1 are flooded to all ports in the specified VLAN, including the peer-link.
2. Packets received from peer-link on SwitchA2 will be blocked to all MLAG member ports.
Flood control process of traffic from uplink is similar to that of traffic from downlink, and is not mentioned here.
NOTE: All packets received from the peer-link are blocked from all MLAG member ports, except DHCP Offer/Ack packets.
You can run the run show mlag internal command to view the status of flood control. For example:
root@XorPlus# run show mlag internal
Domain-id  Local-LAG  Flood  MAC-sync  State     Config-Match  Role
----------------------------------------------------------------------------------------------------------
1          ae1        false  false     DOWN      No            MASTER
If Flood is false, all traffic received on the MLAG peer link is blocked from being forwarded to the MLAG member ports on the MLAG peer device.
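The effect of the block mask on flooded traffic can be sketched as a port filter. This is a simplified model: the function name and port names are hypothetical, an empty blocked set stands for a cleared block mask, and the DHCP Offer/Ack exception is not modeled.

```python
def flood_egress_ports(ingress, vlan_ports, peer_link, blocked_member_ports):
    """Return the ports a flooded frame is sent out of.

    blocked_member_ports holds the MLAG member ports currently covered by the
    block mask; an empty set models a cleared mask.
    """
    out = [p for p in vlan_ports if p != ingress]  # never send back to the ingress port
    if ingress == peer_link:
        # Frames arriving on the peer link must not reach masked MLAG member ports.
        out = [p for p in out if p not in blocked_member_ports]
    return out


# Hypothetical ports: ae1 is the MLAG member port, ae3 the peer link, ge-1/1/1 an uplink.
vlan_ports = ["ae1", "ae3", "ge-1/1/1"]
print(flood_egress_ports("ae1", vlan_ports, "ae3", {"ae1"}))  # frame from the downstream switch
print(flood_egress_ports("ae3", vlan_ports, "ae3", {"ae1"}))  # frame from the peer link
```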
IGMP Snooping Interoperability with MLAG
To achieve IGMP snooping interoperability with MLAG, the key is to implement synchronization of the Layer 2 multicast forwarding entries between the MLAG peer devices through the peer link, which ensures that multicast traffic in the same multicast group can be normally forwarded at both ends of the MLAG peer device.
When an MLAG member port receives an IGMP protocol packet, the MLAG device generates a Layer 2 multicast forwarding entry of a router port or an IGMP member port based on this MLAG member port and floods this packet to all the links of the same multicast group.
On the MLAG peer link, the IGMP protocol packet will be forwarded to the MLAG peer in two ways.
- First, the IGMP protocol packet is flooded to the MLAG peer as Layer 2 multicast (an IGMP report packet is forwarded only to the router ports; an IGMP query packet is forwarded to all ports in the VLAN). After the MLAG peer device receives the packet, it floods the packet to all links of the multicast group except the link of the MLAG member port, and generates a Layer 2 multicast forwarding entry for the peer link interface.
- At the same time, the packet is also synchronized to the MLAG peer through a mechanism called MLAG synchronization. After the MLAG peer device receives this copy, it does not forward the packet but generates a Layer 2 multicast forwarding entry for the local MLAG member port. In this way, the Layer 2 multicast forwarding entries are synchronized between the MLAG peer devices, ensuring that multicast traffic of the same multicast group is forwarded correctly by both MLAG peer devices.
When a non-MLAG member port receives an IGMP message, the MLAG device generates a Layer 2 multicast forwarding entry of a router port or an IGMP member port based on this non-MLAG member port and floods this packet to all the links of the same multicast group, including the MLAG peer link. But this packet will not be synchronized through the MLAG synchronization mechanism as described above.
For example, Figure 9 shows a network of IGMP snooping interoperation with MLAG, where Switch A1, Switch A2 and the MLAG member ports connected to Switch B form an MLAG.
Figure 9. Network of IGMP Snooping Interoperation with MLAG
When the MLAG member port (ae1) on Switch A2 receives an IGMP report message from the Host, Switch A2 generates an IGMP member port multicast forwarding entry of ae1 and floods this IGMP report message to all the router ports of the same multicast group.
On the MLAG peer link, the IGMP report message will be forwarded to the MLAG peer in two ways.
- First, the IGMP report message is flooded to Switch A1 as Layer 2 multicast (an IGMP report packet is forwarded only to the router ports; an IGMP query packet is forwarded to all ports in the VLAN). After Switch A1 receives the packet, it floods the packet to all links of the multicast group except the link of the MLAG member port (link of ae1), and generates a Layer 2 multicast forwarding entry for the peer link interface ae2.
- Second, the IGMP report message is also synchronized to Switch A1 through the MLAG synchronization mechanism. After Switch A1 receives this copy, it does not forward the packet but generates a Layer 2 multicast forwarding entry for the local MLAG member port ae1. In this way, the Layer 2 multicast forwarding entries are synchronized between the MLAG peer devices.
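The two delivery paths in this example can be modeled with a small table on the receiving peer. The class and method names are hypothetical, the group address 225.1.1.1 is an arbitrary illustrative value, and the port names ae1 and ae2 follow the example above.

```python
from collections import defaultdict


class SnoopingTable:
    """Layer 2 multicast forwarding entries on one MLAG peer: group -> set of ports."""

    def __init__(self, member_port, peer_link_port):
        self.member_port = member_port
        self.peer_link_port = peer_link_port
        self.entries = defaultdict(set)

    def on_flooded_report(self, group):
        """Copy flooded over the peer link: add a peer-link entry; packet is forwarded on."""
        self.entries[group].add(self.peer_link_port)
        return True  # forwarded to the group's links, except the MLAG member port

    def on_synced_report(self, group):
        """Copy received via MLAG synchronization: add a member-port entry; not forwarded."""
        self.entries[group].add(self.member_port)
        return False


# Switch A1 receives both copies of the report that the Host sent to Switch A2.
switch_a1 = SnoopingTable(member_port="ae1", peer_link_port="ae2")
switch_a1.on_flooded_report("225.1.1.1")
switch_a1.on_synced_report("225.1.1.1")
print(sorted(switch_a1.entries["225.1.1.1"]))
```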
Interoperability with Other Features
• LACP
LAG (Link Aggregation) is a way of binding multiple physical links into a combined logical link. We recommend that you enable LACP on the interfaces of each link aggregation group. This configuration makes it easier to check compatibility between devices and to detect unidirectional links, and it reacts dynamically to configuration changes and link failures.
You can configure an MLAG system ID for LACP; the system ID is an optional configuration. If the system ID is configured, it is used for LACP negotiation; otherwise the system MAC address is used.
• MSTP
MLAG communication avoids the link loop but there is still a link loop risk between MLAGs of different devices and non-MLAG connections. Therefore, in order to avoid loops causing broadcast storms and making the MAC address table unstable, it is still recommended to enable the MSTP function.
• VRRP
VRRP groups combine the devices of the MLAG peer-link end to a virtual router and use the IP address of the virtual router to communicate with the external networks as the default gateway address. When the gateway fails, VRRP mechanism can elect a new gateway to transmit service traffic thus ensuring the reliable communication of the layer-3 network.
• VXLAN
Implementing VXLAN technology on the devices at the MLAG peer-link end provides an overlay network on top of existing layer 2 and layer 3 technologies to support elastic compute architectures, making it easier for network engineers to scale out a cloud computing environment while logically isolating cloud apps and tenants.
Typical Fault Scenarios
Downstream Link from Access Switch Down
In this case, all traffic is sent to the MLAG SLAVE device and forwarded from the MLAG SLAVE device to the upstream link.
Figure 10. Typical Fault Scenario of Downstream Link Down
Upstream Link to Layer-3 Device Down
In this case, traffic that is load-shared to the MLAG MASTER is sent to the MLAG SLAVE device over the MLAG peer-link and then forwarded to the upstream link.
Figure 11. Typical Fault Scenario of Upstream Link Down
MLAG Peer-link Down
The MLAG peer devices check the peer-link status by exchanging hello packets. If the same MLAG system ID is configured on both sides of the MLAG peer-link, the access switch cannot sense that the peer-link is down, so traffic continues to go through both links. If the MLAG system ID is not configured, however, the access switch chooses the Master to transmit uplink traffic.
Figure 12. Typical Fault Scenario of Peer-link Down
MLAG Master Device Fault
When the master switch reboots or shuts down and the system ID is configured, the slave device continues to use the configured system ID as the system MAC for LACP. Because the system MAC for LACP has not changed, all traffic is forwarded by this functioning device.
When the master switch reboots or shuts down and the system ID is not configured, the peer device considers the peer-link down if it has not received a hello packet for a period of 3 times the hello-interval. If the reload-delay timer is not set to zero, the device enters the reload-delay process; see Reload Delay for details.
Figure 13. Typical Fault Scenario of MLAG Master Fault
Copyright © 2025 Pica8 Inc. All Rights Reserved.