Cisco OTV (Part I)
June 16, 2011
OTV (Overlay Transport Virtualization) is a technology that provides layer2 extension capabilities between different data centers. In its simplest form, OTV is a new DCI (Data Center Interconnect) technology that routes MAC-based information by encapsulating traffic in normal IP packets for transit.
Cisco has submitted an IETF draft (draft-hasmit-otv-01), but it is not finalized yet.
Traditional L2VPN technologies, like EoMPLS and VPLS, rely heavily on tunnels. Rather than creating stateful tunnels, OTV encapsulates layer2 traffic with an IP header and does not create any fixed tunnels.
OTV only requires IP connectivity between remote data center sites, which allows the transport infrastructure to be layer2 based, layer3 based, or even label switched. IP connectivity is the base requirement, along with some additional connectivity requirements that will be covered in this post.
OTV requires no changes to existing data centers to work, but it is currently only supported on the Nexus 7000 series switches with M1-Series linecards.
A big enhancement OTV brings to the DCI realm is its control plane functionality of advertising MAC reachability information, instead of relying on traditional data plane learning through flooding. OTV refers to this concept as MAC routing, aka MAC-in-IP routing. The MAC-in-IP routing is done by encapsulating an Ethernet frame in an IP packet before it is forwarded across the IP transport network. The action of encapsulating the traffic between the OTV devices creates what is called an overlay between the data center sites. Think of an overlay as a logical multipoint bridged network between the sites.
OTV is deployed on devices at the edge of the data center sites, called OTV Edge Devices. These Edge Devices perform typical layer-2 learning and forwarding functions on their site facing interfaces (the Internal Interfaces) and perform IP-based virtualization functions on their core facing interface (the Join Interface) for traffic that is destined via the logical bridge interface between DC sites (the Overlay Interface).
Each Edge Device must have an IP address which is significant in the core/provider network for reachability, but is not required to have any IGP relationship with the core. This allows OTV to be inserted into any type of network in a much simpler fashion.
Let's look at some OTV terminology.
OTV Edge Device
- Is a device (Nexus 7000 or Nexus 7000 VDC) that sits at the edge of a data center and performs all the OTV functions in order to connect to other data centers.
- The OTV edge device is connected to the layer2 DC domain as well as the IP transport network.
- With NX-OS 5.1 a maximum of two OTV edge devices can be deployed on a site to allow for redundancy.
OTV Internal Interfaces
- Are the layer2 interfaces on the OTV Edge Device configured as a trunk or an access port.
- Internal interfaces take part in the STP domain and learn MAC addresses as per normal.
OTV Join Interface
- Is a layer3 interface on the OTV Edge Device that connects to the IP transport network.
- This interface is used as the source for OTV encapsulated traffic that is sent to remote OTV Edge Devices.
- With NX-OS 5.1 this must be a physical interface or a layer3 port channel. Loopback interfaces are not supported in the current implementation.
- A single Join interface can be defined and associated with a given OTV overlay.
- Multiple overlays can also share the same Join interface.
OTV Overlay Interface
- Is a logical multi-access and multicast-capable interface where all the OTV configuration is explicitly defined by a user.
- The overlay interface acts as a logical bridge interface between DC sites and identifies which layer2 frames should be dynamically encapsulated by OTV before being forwarded out the join interface.
OTV Control-Group
- Is the multicast group used by the OTV speakers in an overlay network.
- A unique multicast address is required for each overlay group.
OTV Data-Group
- Is used to encapsulate any layer2 multicast traffic that is extended across the overlay.
Extended VLANs
- Are the VLANs that are explicitly allowed to be extended across the overlay between sites.
- If not explicitly allowed, the MAC addresses from a VLAN will not be advertised across the overlay.
Site VLAN
- Is the VLAN used for communication between local OTV edge devices within a site.
- Is used to facilitate the role election of the AED (Authoritative Edge Device).
- The Site VLAN must exist and be active (explicitly defined, or the default configuration is used).
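To tie the terminology together, here is a minimal Python sketch that models the overlay definitions on one edge device and checks a few of the rules listed above. The names `Overlay` and `validate_overlays`, the interface names, and the addresses are all made up for illustration; this is not how NX-OS represents or validates its configuration.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address


@dataclass
class Overlay:
    """Hypothetical model of one OTV overlay on an edge device."""
    name: str
    join_interface: str      # layer3 interface facing the IP transport
    control_group: str       # multicast group used for control traffic
    extended_vlans: set = field(default_factory=set)


def validate_overlays(overlays, site_vlan, defined_vlans):
    """Return a list of problems found (an empty list means no problems)."""
    problems = []
    if site_vlan not in defined_vlans:
        problems.append(f"site VLAN {site_vlan} is not defined")
    seen_groups = {}
    for ov in overlays:
        # Loopback join interfaces are not supported (per the NX-OS 5.1 note).
        if ov.join_interface.lower().startswith("loopback"):
            problems.append(f"{ov.name}: loopback join interface not supported")
        if not ip_address(ov.control_group).is_multicast:
            problems.append(f"{ov.name}: {ov.control_group} is not multicast")
        # Each overlay needs its own control group.
        if ov.control_group in seen_groups:
            problems.append(
                f"{ov.name}: control group {ov.control_group} "
                f"already used by {seen_groups[ov.control_group]}")
        seen_groups.setdefault(ov.control_group, ov.name)
    return problems


overlays = [
    Overlay("Overlay1", "Ethernet1/1", "239.1.1.1", {100, 101}),
    # Sharing the join interface is fine; reusing the control group is not.
    Overlay("Overlay2", "Ethernet1/1", "239.1.1.1", {200}),
    Overlay("Overlay3", "loopback0", "239.1.1.3", {300}),
]
problems = validate_overlays(overlays, site_vlan=10,
                             defined_vlans={10, 100, 101, 200, 300})
for line in problems:
    print(line)
```

Only the rules stated in this post are checked, so treat it as a memory aid rather than a complete pre-deployment check.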
Let's take a deeper look at how OTV works at the control plane.
As mentioned already, OTV relies on the control plane to advertise MAC reachability information. The underlying routing protocol used in the control plane is IS-IS (Intermediate System to Intermediate System). IS-IS hellos and LSPs are encapsulated in the OTV IP multicast header. The OTV IS-IS packets use a distinct layer2 multicast destination address. Therefore, OTV IS-IS packets do not conflict with IS-IS packets used for other technologies.
IS-IS is a natural choice for two reasons.
– Firstly, IS-IS does not use IP to carry routing information messages; it uses CLNS. Thus IS-IS is neutral regarding the type of network addresses for which it can route traffic, making it ideal to route MAC reachability information.
– Secondly, IS-IS is built on TLVs. A TLV (Type, Length, Value) is an encoding format used to add optional information to data communication protocols like IS-IS. This is how IS-IS could easily be extended to carry the new information fields.
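The TLV idea is easy to see in a few lines of Python. The sketch below packs and unpacks TLVs with a 1-byte type and a 1-byte length, which is the layout IS-IS uses. The type code `MAC_REACH_TLV = 200` and the sample values are assumptions made up for this example; they are not the codes OTV actually uses.

```python
import struct


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Pack one TLV: 1-byte type, 1-byte length, then the value."""
    if len(value) > 255:
        raise ValueError("value too long for a 1-byte length field")
    return struct.pack("!BB", tlv_type, len(value)) + value


def decode_tlvs(data: bytes):
    """Yield (type, value) pairs from a byte string of back-to-back TLVs."""
    offset = 0
    while offset < len(data):
        tlv_type, length = struct.unpack_from("!BB", data, offset)
        offset += 2
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV")
        yield tlv_type, value
        offset += length


# Hypothetical type code for a "MAC reachability" TLV (illustration only).
MAC_REACH_TLV = 200

mac_a = bytes.fromhex("0000aaaa0001")
existing = encode_tlv(1, b"\x01\x02\x03")      # stand-in for a pre-existing TLV
pdu = existing + encode_tlv(MAC_REACH_TLV, mac_a)

decoded = list(decode_tlvs(pdu))
for tlv_type, value in decoded:
    print(tlv_type, value.hex())
```

Because every TLV carries its own length, a parser can step over a type it does not recognize, which is why adding a new kind of information does not require changing the rest of the message format.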
Do you need to understand IS-IS to understand OTV? Do you need to know how the engine of a car works in order to drive it? No, but it's best to have at least a basic understanding of how it all fits together.
Before any MAC reachability information can be exchanged, all OTV Edge Devices must become “adjacent”.
- This is possible by using the specified OTV Control-Group across the transport infrastructure to exchange the control protocol messages and advertise the MAC reachability information.
- Additional documentation indicates that unicast transport support will be possible in future Cisco software releases (post NX-OS 5.1), by using a concept known as an “Adjacency Server”. I will cover that later when relevant.
For now, let's focus on using multicast in the transport infrastructure. All OTV Edge Devices should be configured to join a specific ASM (Any Source Multicast) group where they simultaneously play the role of receiver and source. This is a multicast host functionality.
Control Plane Neighbor Discovery
- Each OTV Edge Device sends an IGMP report to join the specific ASM group used to carry control protocol exchanges. The Edge Devices join the group as hosts. This is IGMP, not PIM.
- OTV Hello packets are generated to all other OTV Edge Devices, to communicate the local Edge Device's existence and to trigger the establishment of control plane adjacencies.
- The OTV Hello packets are sent across the logical overlay to the remote device. This action implies that the original frames are OTV encapsulated by adding an external IP header. The source IP address is that of the Join interface, and the destination is the ASM multicast group as specified for the control traffic.
- With a multicast enabled transport network, the multicast frames are replicated for each OTV device that joined the multicast control group.
- The receiving OTV Edge Devices strip off the encapsulating IP header before the packet is passed to the control plane for processing.
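The discovery steps above can be sketched as a toy simulation. Nothing here touches a real network: `MulticastTransport` and `EdgeDevice` are hypothetical classes, the "IGMP join" is just a list append, and the hello is a plain dictionary rather than an IS-IS PDU. The addresses are documentation examples.

```python
from collections import defaultdict


class MulticastTransport:
    """Toy multicast-enabled core: replicates a packet to group members."""

    def __init__(self):
        self.members = defaultdict(list)

    def join(self, group, device):
        # Stands in for the IGMP report; the edge device joins as a host.
        self.members[group].append(device)

    def send(self, group, packet):
        for device in self.members[group]:
            if device.join_ip != packet["src"]:   # do not echo to the sender
                device.receive(packet)


class EdgeDevice:
    def __init__(self, name, join_ip, transport, control_group):
        self.name = name
        self.join_ip = join_ip
        self.transport = transport
        self.control_group = control_group
        self.adjacencies = set()
        transport.join(control_group, self)

    def send_hello(self):
        # Outer header: source = Join interface IP, destination = ASM group.
        packet = {"src": self.join_ip, "dst": self.control_group,
                  "payload": {"type": "hello", "from": self.name}}
        self.transport.send(self.control_group, packet)

    def receive(self, packet):
        payload = packet["payload"]               # outer IP header removed
        if payload["type"] == "hello":
            self.adjacencies.add(payload["from"])


core = MulticastTransport()
group = "239.1.1.1"
west = EdgeDevice("west", "192.0.2.1", core, group)
east = EdgeDevice("east", "192.0.2.2", core, group)
south = EdgeDevice("south", "192.0.2.3", core, group)

for device in (west, east, south):
    device.send_hello()

print(sorted(west.adjacencies))   # ['east', 'south']
```

The point to notice is that no device is configured with its peers' addresses: knowing the group is enough, and the transport does the replication.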
Once the OTV Edge Devices have discovered each other, they are ready to exchange MAC address reachability information, which follows a very similar process.
Control Plane MAC Address Advertisement
- The OTV Edge Device in the West data center site learns new MAC addresses (MAC A, B and C on VLAN 100) on its internal interface. This is done via traditional data plane learning.
- An OTV Update message is created containing information for MAC A, MAC B and MAC C. The message is OTV encapsulated and sent into the Layer 3 transport. Same as before, the source IP address is that of the Join interface, and the destination is the ASM multicast group.
- With a multicast enabled transport network, the multicast frames are replicated for each OTV device that joined the multicast control group. The OTV packets are decapsulated and handed to the OTV control plane.
- The MAC reachability information is imported into the MAC address tables (CAMs) of the Edge Devices. The interface information to reach MAC-A, MAC-B and MAC-C is the IP address of the Join Interface on the West OTV Edge Device.
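As a rough illustration of the last step, the sketch below shows a remote edge device importing an update so that the advertised MACs point at the advertiser's Join interface IP. The functions `build_update` and `import_update`, the dictionary layout, and the addresses are assumptions for this example only; a real update is an IS-IS message, not a Python dictionary.

```python
def build_update(join_ip, vlan, macs):
    """Build a toy update listing MACs learned locally on one VLAN."""
    return {"src": join_ip, "vlan": vlan, "macs": list(macs)}


def import_update(mac_table, update, local_join_ip):
    """Install advertised MACs with the advertiser's Join IP as next hop."""
    if update["src"] == local_join_ip:
        return                      # ignore our own advertisement
    for mac in update["macs"]:
        mac_table[(update["vlan"], mac)] = update["src"]


west_ip, east_ip = "192.0.2.1", "192.0.2.2"

# East already knows one MAC through normal data plane learning.
east_table = {(100, "MAC-X"): "Ethernet2/1"}

update = build_update(west_ip, 100, ["MAC-A", "MAC-B", "MAC-C"])
import_update(east_table, update, east_ip)

for key in sorted(east_table):
    print(key, "->", east_table[key])
```

After the import, the East table holds both kinds of entries side by side: a local interface for MAC-X and an IP address for MAC-A, MAC-B and MAC-C.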
Once the control plane adjacencies between the OTV Edge Devices are established and MAC address reachability information has been exchanged, traffic between the sites is possible.
It is important to note that traffic within a site will not traverse the overlay, and why should it? The OTV edge will have the destination MAC pointing towards a local interface.
Data Plane Traffic Forwarding
- A layer2 frame arrives destined to a MAC address that was learned via the overlay, so the next-hop interface is across the overlay.
- The original layer2 frame is OTV encapsulated and the DF bit is set. The overlay encapsulation format is the layer2 frame encapsulated in UDP using GRE, with destination port 8472. The source IP address is that of the Join interface. The destination IP is not the multicast group, but the IP address of the Join Interface of the remote OTV edge that advertised the destination MAC (MAC-3).
- The unicast frame is carried across the transport infrastructure directly to the remote OTV Edge Device. If it was a broadcast frame, it would reach all remote OTV Edge Devices. If it was a multicast frame, it would only be forwarded to remote OTV Edge Devices that have subscribing members, using an OTV data group address.
- The remote OTV Edge Device decapsulates the frame exposing the original Layer 2 packet.
- The remote OTV Edge Device performs a layer2 lookup on the original Ethernet frame and determines the exit interface towards the destination.
- The frame reaches its destination.
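The forwarding decision in the steps above boils down to one table lookup. The sketch below is a simplified model under stated assumptions: the table layout, the `forward` function and the addresses are hypothetical, and the "encapsulation" is a dictionary rather than a real OTV header. What to do with an unknown destination is left out of scope.

```python
def forward(mac_table, vlan, dst_mac, frame, join_ip):
    """Decide where a unicast frame goes based on how its MAC was learned."""
    entry = mac_table.get((vlan, dst_mac))
    if entry is None:
        return ("unknown", None)          # not covered by this sketch
    kind, next_hop = entry
    if kind == "local":
        # Intra-site traffic never touches the overlay.
        return ("local", next_hop)
    # Learned via the overlay: wrap the frame in an outer IP header whose
    # destination is the remote Join interface, with DF set.
    packet = {"src": join_ip, "dst": next_hop, "df": True, "payload": frame}
    return ("overlay", packet)


mac_table = {
    (100, "MAC-1"): ("local", "Ethernet2/1"),
    (100, "MAC-3"): ("overlay", "192.0.2.2"),
}

print(forward(mac_table, 100, "MAC-1", b"frame", "192.0.2.1"))
print(forward(mac_table, 100, "MAC-3", b"frame", "192.0.2.1"))
```

The remote side runs the same lookup after decapsulation, except that there the entry for MAC-3 is a local interface.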
OTV Header Format
Some consideration must be given to the MTU across the transport infrastructure. Consider the OTV packet header layout.
A 42 byte OTV header is added and the DF (Don’t Fragment) bit is set on ALL OTV packets. The DF bit is set because the Nexus 7000 does not support fragmentation and reassembly. The source VLAN ID and the Overlay ID are set, and the 802.1P priority bits from the original layer2 frame are copied to the OTV header, before the OTV packet is IP encapsulated. Increasing the MTU size of all transport interfaces is therefore required for OTV. This challenge is no different from other DCI technologies like VPLS and EoMPLS.
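The MTU arithmetic is simple enough to check in a few lines. This sketch only uses the 42 byte figure quoted above; the function names and the sample MTU values are assumptions for illustration.

```python
OTV_OVERHEAD = 42  # bytes added by OTV, per the figure quoted in this post


def required_transport_mtu(site_mtu):
    """Smallest transport MTU that carries a full-size site packet."""
    return site_mtu + OTV_OVERHEAD


def fits(packet_size, transport_mtu):
    """With DF set, an oversized packet cannot be fragmented in transit."""
    return packet_size + OTV_OVERHEAD <= transport_mtu


print(required_transport_mtu(1500))   # 1542
print(fits(1500, 1500))               # False
print(fits(1500, 1542))               # True
```

In other words, with a standard 1500 byte MTU inside the sites, every interface along the transport path needs to carry at least 1542 bytes.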
In Part II we will have a look at the rest of OTV:
- STP BPDU Handling
- Multi-homing (Load-Balancing)
- FHRP Isolation