VXLAN BGP EVPN: VXLAN Control Plane

Today I am going to talk about VXLAN, VXLAN BGP EVPN support, and the design considerations for VXLAN BGP EVPN.

Below are the major points to keep in mind when you deploy VXLAN BGP EVPN in your environment; make sure you take these recommendations into account in your network design. I am assuming that you already know about VXLAN BGP EVPN, but let me take you through it once.

What is VXLAN?
VXLAN stands for Virtual Extensible LAN, which is an overlay protocol in the fabric network. VXLAN is built over the traditional Layer 3 underlay network.
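
As a tiny illustration of the overlay idea, on a Cisco Nexus VTEP a classic 12-bit VLAN is simply mapped to a 24-bit VXLAN Network Identifier (VNI), which is what lifts the segment limit from roughly 4K VLANs to about 16 million VNIs. The VLAN and VNI numbers below are assumptions for illustration only:

    feature vn-segment-vlan-based

    vlan 100
      ! map local VLAN 100 to VXLAN segment (VNI) 10100
      vn-segment 10100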

What is VXLAN BGP EVPN?
To overcome the limitations of flood and learn, we use MP-BGP EVPN as the control plane in a VXLAN environment. With the MP-BGP EVPN control plane, you get protocol-based VTEP peer discovery and end-host reachability information distribution. This allows more scalable VXLAN overlay network designs suitable for cloud environments, public or private. The MP-BGP EVPN control plane has a rich set of features that are very beneficial for overlay networks: it reduces or eliminates traffic flooding in the overlay network and enables optimal forwarding for both east-west and north-south traffic.
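
As a quick, minimal sketch of what "BGP EVPN as the control plane" looks like on a Cisco Nexus VTEP (the AS number, neighbor address, and loopbacks here are assumptions), the key pieces are the l2vpn evpn address family toward the route reflector and host-reachability protocol bgp on the NVE interface:

    feature bgp
    feature nv overlay
    nv overlay evpn

    router bgp 65000
      neighbor 10.0.0.101 remote-as 65000
        description iBGP EVPN peering to spine route reflector
        update-source loopback0
        address-family l2vpn evpn
          send-community extended

    interface nve1
      no shutdown
      ! learn remote VTEPs and end hosts from BGP EVPN instead of flood and learn
      host-reachability protocol bgp
      source-interface loopback1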

Below is a sample network topology between two data centers running VXLAN with the VXLAN BGP EVPN control plane. I hope you now understand what VXLAN BGP EVPN is.

As I have said again and again, if you want to understand this topic you need to understand the basic concept of the overlay network and the use cases for this kind of overlay. VXLAN is one of the most in-demand protocols nowadays, as people move towards next-generation networks.

Please let me know if you are facing issues understanding the concept of VXLAN in detail. I will try to write another article on VXLAN in detail with the configurations.

Fig 1.1- VXLAN BGP EVPN topology between two data centers
Let's discuss the points to remember for VXLAN BGP EVPN:

  • In a VXLAN EVPN setup with a 2K VNI scale configuration, control plane downtime can take more than 200 seconds. To avoid a BGP flap, configure the BGP graceful restart time to 300 seconds (see the configuration sketch after this list).
  • Remember that sub-interfaces as core links are not supported in multisite EVPN.
  • In a VXLAN EVPN setup, border leaves must use unique route distinguishers (RDs) in an MPLS environment; for this, use the rd auto command (see the sketch after this list). It is important to note that using the same route distinguisher on different border leaves is not supported.
  • Always remember that ARP suppression is only supported for a VNI if the VTEP hosts the first-hop gateway (distributed anycast gateway) for this VNI. The VTEP and the SVI for this VLAN have to be properly configured for distributed anycast gateway operation, for example, the global anycast gateway MAC address configured and the anycast gateway feature with the virtual IP address on the SVI (see the sketch after this list).
  • Another important thing to remember is that DHCP snooping is not supported on VXLAN VLANs.
  • Also remember that SPAN TX for VXLAN-encapsulated traffic is not supported on the Layer 3 uplink interface.
  • RACLs are not supported on Layer 3 uplinks for VXLAN traffic. Egress VACL support is not available for decapsulated packets in the network-to-access direction on the inner payload.
  • Regarding QoS, note that QoS classification is not supported for VXLAN traffic in the network-to-access direction on the Layer 3 uplink interface.
  • Another important point: a VTEP does not support Layer 3 subinterface uplinks that carry VXLAN-encapsulated traffic.
  • In a VXLAN environment, Layer 3 interface uplinks that carry VXLAN-encapsulated traffic do not support subinterfaces for non-VXLAN-encapsulated traffic.
  • Non-VXLAN sub-interface VLANs cannot be shared with VXLAN VLANs.
  • Point-to-multipoint Layer 3 and SVI uplinks are not supported. Since both uplink types can only be enabled point-to-point, they cannot span more than two switches.
  • Another point to note: for eBGP, it is recommended to use a single overlay eBGP EVPN session between loopbacks (see the sketch after this list).
  • In a VXLAN BGP EVPN environment, bind the NVE interface to a loopback address that is separate from the other loopback addresses required by Layer 3 protocols. A best practice is to use a dedicated loopback address for VXLAN (see the sketch after this list).
  • Keep in mind that VXLAN BGP EVPN does not support an NVE interface in a non-default VRF.
  • Always configure a single BGP session over the loopback for the overlay BGP session.
  • A well-known UDP port number is used for VXLAN encapsulation. For Cisco NX-OS, the UDP port number is 4789; it complies with the IETF standard and is not configurable.
  • Cisco Nexus 9200 Series switches that have the Application Spine Engine have a Layer 3 VXLAN (SVI) throughput issue.
  • Another point to remember: VXLAN does not support co-existence with the GRE tunnel feature or the MPLS (static or segment routing) feature on Cisco Nexus 9000 Series switches with a Network Forwarding Engine (NFE).
  • Let's talk about IP multicast: to establish IP multicast routing in the core, IP multicast, PIM, and RP configuration are required (see the anycast RP sketch after this list).
  • VTEP-to-VTEP unicast reachability can be provided through any IGP or through BGP.
  • When changing the IP address of a VTEP device, it is always recommended to shut the NVE interface and its source loopback before changing the IP address.
  • Again, in a multicast environment, the RP for the multicast group should be configured only on the spine layer. Use anycast RP for RP load balancing and redundancy (see the sketch after this list).
  • As I said earlier, configure ARP suppression with BGP EVPN, and use the hardware access-list tcam region arp-ether size double-wide command to accommodate ARP in this region (see the sketch after this list).
  • In a data center environment, for a vPC device, BUM traffic (broadcast, unknown unicast, and multicast) from hosts is replicated on the peer-link. A copy is made of every native packet, and each native packet is sent across the peer-link to service orphan ports connected to the peer vPC switch.
  • To prevent traffic loops in VXLAN networks, native packets ingressing the peer-link cannot be sent to an uplink. However, if the peer switch is the encapsulating switch, the copied packet traverses the peer-link and is sent to the uplink.
  • Again, in a data center environment on a vPC pair, shutting down the NVE or the NVE loopback on one of the vPC nodes is not a supported configuration. This means that traffic failover on a one-side NVE shut or a one-side loopback shut is not supported.
  • Redundant anycast RPs configured in the network for multicast load balancing and RP redundancy are supported on vPC VTEP topologies.
  • Always enable the vPC peer-gateway configuration. For peer-gateway functionality, at least one backup routing SVI is required to be enabled across the peer-link and also configured with PIM. This provides a backup routing path in case a VTEP loses complete connectivity to the spine; remote peer reachability is re-routed over the peer-link in this case (see the sketch after this list).
  • When changing the secondary IP address of an anycast vPC VTEP, shut the NVE interfaces on both the vPC primary and the vPC secondary before making the IP changes.
  • To provide redundancy and failover of VXLAN traffic when a VTEP loses all of its uplinks to the spine, it is recommended to run a Layer 3 link or an SVI link over the peer-link between vPC peers.
  • If DHCP relay is required in a VRF for DHCP clients, or if a loopback in a VRF is required for reachability tests on a vPC pair, it is necessary to create a backup SVI per VRF with PIM enabled.
  • Finally, let's talk about the MAC-in-UDP encapsulation. Because of the MAC-in-UDP encapsulation, VXLAN introduces 50 bytes of overhead to the original frames. The maximum transmission unit (MTU) in the transport network therefore needs to be increased by 50 bytes. If the overlays use a 1500-byte MTU, the transport network needs to be configured to accommodate 1550-byte packets at a minimum. Jumbo frame support in the transport network is required if the overlay applications tend to use frame sizes larger than 1500 bytes (see the MTU sketch after this list).
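
Below are a few minimal configuration sketches for the points above. They use Cisco NX-OS syntax, and all AS numbers, VLAN/VNI IDs, interface names, and IP addresses are assumptions for illustration only; adjust them to your own design.

For the graceful restart recommendation, the restart timer is raised under the BGP process (assumed AS 65000):

    router bgp 65000
      ! extend the restart timer so a control plane restart at 2K VNI scale
      ! does not cause the BGP sessions to flap
      graceful-restart restart-time 300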
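
For unique route distinguishers on the border leaves, rd auto derives the RD from the local router ID, so every border leaf automatically ends up with a different value. A per-VNI sketch (assumed Layer 2 VNI 10100):

    evpn
      vni 10100 l2
        ! rd auto is built from the local router ID, so it is unique per switch
        rd auto
        route-target import auto
        route-target export auto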
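
For ARP suppression together with the distributed anycast gateway, a sketch could look like the following (assumed VLAN 100, VNI 10100, tenant VRF TENANT-A, gateway 10.100.0.1/24, and anycast gateway MAC 2020.0000.00aa; TCAM carving typically requires freeing space in another region and a reload):

    ! carve a double-wide arp-ether region so ARP suppression fits in the TCAM
    hardware access-list tcam region arp-ether 256 double-wide

    feature fabric forwarding
    fabric forwarding anycast-gateway-mac 2020.0000.00aa

    interface Vlan100
      no shutdown
      vrf member TENANT-A
      ip address 10.100.0.1/24
      fabric forwarding mode anycast-gateway

    interface nve1
      member vni 10100
        ! the VTEP answers ARP requests locally from the EVPN-learned entries
        suppress-arp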
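
For the dedicated VTEP loopback and the single overlay eBGP EVPN session between loopbacks, a sketch could be (assumed local AS 65001, remote AS 65000, loopback0 for BGP peering, loopback1 as the NVE source):

    interface loopback0
      description routing and BGP peering loopback
      ip address 10.0.0.11/32

    interface loopback1
      description dedicated VTEP source loopback for VXLAN
      ip address 10.0.1.11/32

    interface nve1
      host-reachability protocol bgp
      source-interface loopback1

    router bgp 65001
      ! one overlay eBGP EVPN session, sourced from and destined to loopbacks
      neighbor 10.0.0.1 remote-as 65000
        update-source loopback0
        ebgp-multihop 5
        address-family l2vpn evpn
          send-community extended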
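
For multicast in the underlay with anycast RP on the spines, each spine carries the same anycast RP address, and the anycast-rp set lists the real loopbacks of all spines (assumed anycast RP 10.254.254.1, spine loopbacks 10.0.0.1 and 10.0.0.2, underlay group range 239.1.1.0/25):

    feature pim

    interface loopback254
      description anycast RP address, identical on every spine
      ip address 10.254.254.1/32
      ip pim sparse-mode

    ! all switches point at the anycast RP for the VXLAN underlay groups
    ip pim rp-address 10.254.254.1 group-list 239.1.1.0/25

    ! on the spines only: the anycast-rp set with each spine's real loopback
    ip pim anycast-rp 10.254.254.1 10.0.0.1
    ip pim anycast-rp 10.254.254.1 10.0.0.2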
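
For the vPC peer-gateway and the backup routing SVI over the peer-link, a sketch could be (assumed vPC domain 10 and backup VLAN 3999 carried on the peer-link; on some Nexus 9000 platforms this VLAN must also be declared as an NVE infra-VLAN, so check the platform guide for your hardware):

    vpc domain 10
      peer-gateway

    vlan 3999

    ! platform-dependent: mark the backup VLAN as an infra-VLAN if required
    system nve infra-vlans 3999

    interface Vlan3999
      description backup routing path over the vPC peer-link
      no shutdown
      ip address 10.99.99.1/30
      ip pim sparse-mode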
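
For the MTU point, increase the MTU on every underlay interface that carries VXLAN-encapsulated traffic to at least 1550 bytes; many designs simply enable jumbo frames end to end (assumed uplink Ethernet1/49):

    interface Ethernet1/49
      description underlay uplink towards the spine
      mtu 9216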