NOTES:
- The following switch platforms support this feature:
  - Trident2
  - Trident2+
  - Tomahawk
  - Tomahawk+
  - Trident3
  - Maverick
- Switch platforms based on the Tomahawk 2 ASIC support only VXLAN L2-VNI, not L3-VNI.
- If VXLAN is deployed in an MLAG domain, it behaves a little differently. For details, see MLAG Configuration.
About VXLAN
Virtual Extensible LAN (VXLAN) is an overlay network virtualization technology. An overlay network is a virtual network built on top of existing Layer 2 and Layer 3 network technologies to support elastic compute architectures. VXLAN makes it easier for network engineers to scale out a cloud computing environment while logically isolating cloud applications and tenants.
VXLAN Technology
VXLAN uses UDP-based encapsulation to tunnel Ethernet frames and carries the original data packets as tunnel payloads. With the outer UDP tunnel, inner payload data can be transferred quickly across Layer 2 and Layer 3 networks. To provide broadcast-domain addressing, VXLAN uses Layer 3 IP multicast in place of Ethernet broadcast, so broadcast, unknown unicast, and multicast (BUM) packets can still be delivered across the virtual network. For more details on VXLAN, see RFC 7348.
VXLAN Packets
As shown in Figure 1-1, a VXLAN packet consists of the outer encapsulation and the inner payloads.
- Flags (8 bits): The I flag must be set to 1 for a valid VXLAN Network Identifier (VNI). The other 7 bits (labeled R) are reserved and must be set to 0 on transmit and ignored on receive.
- VXLAN segment ID or VXLAN VNI (24 bits): This field designates the individual VXLAN overlay network on which the VMs are located.
- Reserved fields (24 bits and 8 bits): These fields must be set to 0 on transmit and ignored on receive.
- The destination UDP port number of the outer tunnel is 4789, which is dedicated to VXLAN.
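The field layout above can be sketched in a few lines of Python. This is only an illustration of the 8-byte VXLAN header described in the list; the function and constant names are assumptions made for the example and are not part of PicOS.

```python
import struct

VXLAN_UDP_PORT = 4789   # outer UDP destination port, as stated above
VXLAN_FLAG_I = 0x08     # I flag: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI (hypothetical helper)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags (8 bits) followed by 24 reserved bits set to 0.
    # Second 32-bit word: VNI (24 bits) followed by 8 reserved bits set to 0.
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Return the VNI from a VXLAN header; reserved bits are ignored on receive."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag is not set, so the VNI is not valid")
    return word2 >> 8
```

For example, `pack_vxlan_header(10010)` yields 8 bytes whose first byte is the flags field and whose bytes 5 to 7 hold the VNI.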
However, VXLAN encapsulation also introduces a problem with the MTU setting.
In general, the default MTU of a VM or host is 1500 bytes, that is, the maximum original Ethernet message is 1500 bytes.
When this message passes through the VTEP, it is encapsulated with 50 bytes of new headers (VXLAN header 8 bytes + UDP header 8 bytes + outer IP header 20 bytes + outer MAC header 14 bytes).
After the encapsulation, the packet size becomes 1550 bytes.
VXLAN packets must not be fragmented and reassembled. All intermediate devices therefore need an MTU at least as large as the VXLAN-encapsulated packet.
If changing the MTU on the intermediate devices is not practical, setting the MTU of the virtual machine to 1450 bytes can also work around the problem temporarily.
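The arithmetic above can be written out as a small Python sketch. The header sizes are the ones listed in this section (IPv4 outer header without options, untagged outer Ethernet header); the function names are assumptions for illustration.

```python
# Header sizes in bytes, as listed in the text above.
VXLAN_HEADER = 8
UDP_HEADER = 8
OUTER_IP_HEADER = 20    # IPv4 without options
OUTER_MAC_HEADER = 14   # Ethernet header without an 802.1Q tag

VXLAN_OVERHEAD = VXLAN_HEADER + UDP_HEADER + OUTER_IP_HEADER + OUTER_MAC_HEADER


def encapsulated_size(original_size: int) -> int:
    """Size of the packet after VXLAN encapsulation at the VTEP."""
    return original_size + VXLAN_OVERHEAD


def max_vm_mtu(intermediate_mtu: int) -> int:
    """Largest VM MTU that still fits through devices with the given MTU."""
    return intermediate_mtu - VXLAN_OVERHEAD
```

With a 1500-byte original message, `encapsulated_size(1500)` gives the 1550 bytes mentioned above, and `max_vm_mtu(1500)` gives the 1450-byte workaround value.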
VXLAN Inner 802.1Q
Encapsulation mode
Encapsulation refers to the flow from access ports to network ports. Use one of the following options to specify how the 802.1Q tag is handled during encapsulation.
- none: Nothing changes. Untagged packets stay untagged and tagged packets stay tagged.
- service-vlan-add: Add an 802.1Q tag to untagged packets and leave tagged packets unchanged. An encapsulation VLAN is required.
- service-vlan-add-delete: Add an 802.1Q tag to untagged packets and delete the tag from tagged packets. An encapsulation VLAN is required.
- service-vlan-add-replace: Add an 802.1Q tag to untagged packets and replace the tag of tagged packets. An encapsulation VLAN is required.
- service-vlan-delete: Delete the 802.1Q tag from tagged packets and leave untagged packets unchanged. This is the default value, in line with RFC 7348.
- service-vlan-replace: Replace the VLAN ID in the 802.1Q tag of tagged packets and leave untagged packets unchanged. An encapsulation VLAN is required.
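As a rough aid, the option list above can be modelled as a function from the incoming tag to the outgoing tag. This sketch is a literal reading of the bullet descriptions only; it does not model PVID handling on access ports or any other switch behavior, and the function name and the use of `None` for "untagged" are assumptions for illustration.

```python
from typing import Optional

# Modes that the list above marks as requiring an encapsulation VLAN.
MODES_NEEDING_VLAN = {
    "service-vlan-add",
    "service-vlan-add-delete",
    "service-vlan-add-replace",
    "service-vlan-replace",
}


def apply_encap_mode(mode: str, tag: Optional[int],
                     encap_vlan: Optional[int] = None) -> Optional[int]:
    """Return the 802.1Q VLAN ID after encapsulation, or None if untagged."""
    if mode in MODES_NEEDING_VLAN and encap_vlan is None:
        raise ValueError(f"{mode} requires an encapsulation VLAN")
    tagged = tag is not None
    if mode == "none":
        return tag                                  # nothing changes
    if mode == "service-vlan-add":
        return tag if tagged else encap_vlan        # add only when untagged
    if mode == "service-vlan-add-delete":
        return None if tagged else encap_vlan       # add or delete
    if mode == "service-vlan-add-replace":
        return encap_vlan                           # add or replace
    if mode == "service-vlan-delete":
        return None                                 # delete; untagged unchanged
    if mode == "service-vlan-replace":
        return encap_vlan if tagged else None       # replace only when tagged
    raise ValueError(f"unknown encapsulation mode: {mode}")
```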
Decapsulation mode
Decapsulation refers to the flow from network ports to access ports.
- none: Nothing changes. Untagged packets stay untagged and tagged packets stay tagged.
- service-vlan-add: From network ports to access ports, add an 802.1Q tag to both untagged and tagged packets. If the access port is matched by port and VLAN, the added tag carries that VLAN ID; otherwise it carries the PVID of that port.
- service-vlan-add-delete: From network ports to access ports, add an 802.1Q tag to both untagged and tagged packets. If the access port is matched by port and VLAN, the added tag carries that VLAN ID; otherwise it carries the PVID of that port. From access to access, delete the tag from tagged packets.
- service-vlan-add-replace: From network ports to access ports, add an 802.1Q tag to both untagged and tagged packets. If the access port is matched by port and VLAN, the added tag carries that VLAN ID; otherwise it carries the PVID of that port. From access to access, replace the tag of tagged packets. This is the default value.
- service-vlan-delete: From access to access, delete the tag from tagged packets.
- service-vlan-replace: From access to access, replace the tag of tagged packets.
- service-vlan-per-port: The decapsulated packet can be tagged or untagged dynamically, based on the setting of the output port.
Based on the above description, see the following three tables for the detailed traffic changes.
The table below shows the traffic changes when VLANs on the access side are bound to a VXLAN on the network side.
Mode | Access→Access (configure with decapsulation mode) | Access→Network (configure with encapsulation mode) | Network→Access (configure with decapsulation mode) |
---|---|---|---|
none | untag-->tag(PVID); tag-->remain tag | untag-->tag(PVID); tag-->remain tag | untag-->untag; tag-->remain tag |
service-vlan-add | untag-->tag(PVID); tag-->remain tag | untag-->tag(PVID); tag-->remain tag | untag-->tag(add vxlan-vlan); tag-->double tag(outer layer add vxlan-vlan) |
service-vlan-add-delete | untag-->untag; tag-->untag | untag-->untag; tag-->untag(tag deleted) | untag-->tag(add vxlan-vlan); tag-->double tag(outer layer add vxlan-vlan) |
service-vlan-add-replace | untag-->tag(PVID); tag-->remain tag | untag-->tag(configured VLAN); tag-->tag(configured VLAN) | untag-->tag(add vxlan-vlan); tag-->double tag(outer layer add vxlan-vlan) |
service-vlan-delete | untag-->untag; tag-->untag | untag-->untag; tag-->untag | untag-->untag; tag-->remain tag |
service-vlan-replace | untag-->tag(PVID); tag-->remain tag | untag-->tag(configured VLAN); tag-->tag(changed to encapsulation vlan) | untag-->untag; tag-->remain tag |
service-vlan-per-port | The decapsulated packet can be tagged or untagged dynamically based on the setting on the output port. |
VXLAN ECMP
VXLAN ECMP is supported in L2/L3. PicOS supports up to 32-way ECMP.
- VXLAN ECMP does not need any special configuration. It depends entirely on routing ECMP. For routing ECMP configuration, see ECMP (Equal-Cost Multipath Routing) Configuration.
- PicOS uses information from the VXLAN header in the hash calculation to ensure better performance.
VXLAN MAC Learning
The VTEP performs source MAC learning on the VNI in the same way as a Layer 2 switch.
- When the switch receives traffic going from the local VTEP to the remote VTEP, the VTEP learns the source MAC address on the access port.
- When the switch receives traffic going from the remote VTEP to the local VTEP, the VTEP learns the source MAC address on the network port.
A VNI MAC address table includes the following types of MAC address entries:
- Access port: Dynamic MAC address entries learned from the local VTEP. VXLAN does not support locally configured static MAC addresses.
- Network port: Includes static and dynamic MAC entries.
  - Static MAC: Static MAC address entries configured on VXLAN tunnel interfaces.
  - Dynamic MAC: MAC address entries learned from incoming traffic on VXLAN tunnels. The learned MAC addresses are taken from the source MAC of the inner Ethernet header.
On a network port, a configured static MAC entry has higher priority than dynamic MAC entries.
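The priority rule between static and dynamic entries can be illustrated with a small per-VNI table. This is a simplified model under stated assumptions, not the PicOS implementation: the class and method names are hypothetical, and a port is represented as a plain string (an access port name or a remote VTEP address).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MacEntry:
    port: str             # access port name or remote VTEP address (assumed form)
    static: bool = False  # True for a configured static entry


class VniMacTable:
    """MAC address table for a single VNI (hypothetical model)."""

    def __init__(self) -> None:
        self.entries: Dict[str, MacEntry] = {}

    def add_static(self, mac: str, tunnel: str) -> None:
        # Static entries are configured on VXLAN tunnel interfaces only.
        self.entries[mac] = MacEntry(tunnel, static=True)

    def learn(self, mac: str, port: str) -> bool:
        """Learn a source MAC dynamically; a static entry is never overwritten."""
        existing = self.entries.get(mac)
        if existing is not None and existing.static:
            return False
        self.entries[mac] = MacEntry(port, static=False)
        return True

    def lookup(self, mac: str) -> Optional[str]:
        entry = self.entries.get(mac)
        return entry.port if entry is not None else None
```

In this model, learning a MAC that already has a static entry is rejected, which reflects the statement that the static entry takes priority.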
VXLAN Traffic Forwarding
Unicast Traffic
- The switch receives traffic from the access port. The VTEP encapsulates the original Ethernet frame with an outer MAC header, outer IP header, and a VXLAN header. The source IP address is the source VTEP's VXLAN tunnel source IP address.
- The local VTEP forwards the encapsulated packet to the destination IP address of the VXLAN tunnel.
- The remote VTEP decapsulates the packet and forwards the frame to the access port.
Broadcast and Unknown Traffic
- The switch receives traffic from the access port. The VTEP encapsulates the original Ethernet frame with an outer MAC header, outer IP header and a VXLAN header. The source IP address is the source VTEP's VXLAN tunnel source IP address.
- The local VTEP floods the encapsulated packets to the destination IP addresses of all VXLAN tunnels.
- Each remote VTEP decapsulates the packet and forwards the frame to the access port.
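The difference between the two cases above comes down to how the local VTEP picks tunnel destinations. The following Python sketch shows that decision under simplifying assumptions: MAC addresses are lowercase strings, `mac_to_vtep` stands for the MAC entries known on network ports, and `flood_vteps` stands for the configured remote VTEP addresses. All names are hypothetical.

```python
from typing import Dict, List

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def select_tunnel_destinations(dst_mac: str,
                               mac_to_vtep: Dict[str, str],
                               flood_vteps: List[str]) -> List[str]:
    """Return the remote VTEP IP addresses that receive a copy of the frame."""
    dst = dst_mac.lower()
    if dst != BROADCAST_MAC and dst in mac_to_vtep:
        # Known unicast: one encapsulated copy to one tunnel destination.
        return [mac_to_vtep[dst]]
    # Broadcast or unknown destination: flood to every tunnel destination.
    return list(flood_vteps)
```

A frame to a known MAC address goes to a single remote VTEP, while a broadcast frame or a frame to an unknown MAC address is sent to every VTEP in the flood list.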
Steps to Map a VLAN to a VXLAN VNI
VXLAN is supported on PicOS L2/L3 switches. The configuration steps are as follows.
Configure the VXLAN source interface
set vxlans source-interface loopback address 10.10.10.25
commit
Create the VXLAN VNI
set vxlans vni 10010
commit
Configure the VTEP address for the VXLAN VNI
set vxlans vni 10010 flood vtep 10.10.10.12
commit
Add the VLAN to the VXLAN VNI
set vxlans vni 10010 vlan 100
commit
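When the same mapping has to be repeated for several VNIs, the command sequence can be generated from parameters. The sketch below only assembles the `set vxlans` command strings shown in the steps above and commits after each step, as the steps do; the function name and parameters are assumptions for illustration, and nothing is sent to a switch.

```python
from typing import List


def vxlan_vlan_mapping_commands(source_ip: str, vni: int,
                                remote_vtep: str, vlan: int) -> List[str]:
    """Return the CLI lines that map one VLAN to one VXLAN VNI."""
    steps = [
        f"set vxlans source-interface loopback address {source_ip}",
        f"set vxlans vni {vni}",
        f"set vxlans vni {vni} flood vtep {remote_vtep}",
        f"set vxlans vni {vni} vlan {vlan}",
    ]
    lines: List[str] = []
    for step in steps:
        lines.append(step)
        lines.append("commit")  # each step is committed separately
    return lines


if __name__ == "__main__":
    for line in vxlan_vlan_mapping_commands("10.10.10.25", 10010, "10.10.10.12", 100):
        print(line)
```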
Application Scenario Limitation
In L2 GRE or VXLAN networks, only one next hop is allowed for the same egress interface. In the following figure, the same egress interface on Switch1 has two tunnels, that is, two next hops, which is not allowed.
However, multiple L2 GRE or VXLAN tunnels can share the same egress port on Switch1 if they are connected through an IP router, because the egress interface then has only one next hop, as shown in the figure below.