Troubleshooting MAC-Flushes on NX-OS

An interesting client problem in one of our multi-tenant data centers came to my attention the other day. A delay sensitive client noticed a slight increase in latency (20 ms) at very intermittent intervals from his servers in our data center to specific off-net destinations. The increase in latency was localized to the pair of Nexus 7000’s functioning as the core switch layer (CSW) and the layer3 edge for this particular data center. Beyond that all appeared normal on the N7K CSWs.

A TCP dump from a normal trunk interface attached to the N7Ks, showed unicast traffic on the N7K-2 device when the N7K-1 device was setup to receive internet traffic inbound and forward it into the data center client VLANs.  The N7Ks are setup using the Cisco VPC (Virtual Port Channels).

Continue reading “Troubleshooting MAC-Flushes on NX-OS”

Detecting Layer2 Loops

We all too familiar with the devastating impact a talented layer 2 loop could have on a data center lacking sufficient controls and processes. If you are using Cisco Nexus switches in your data center, you would be happy to know that NX-OS offers an interesting new tool you should add to your loop detection list. The somewhat undocumented feature is known as (for the lack of a better name)  FWM-Loop Detection. FWM refers to the NX-OS Forwarding Manager. In Syslog it is seen as:

%FWM-2-STM_LOOP_DETECT

Continue reading “Detecting Layer2 Loops”

The Fabric ERA

“Fabric” is a loosely used term, which today creates more confusion instead of offering direction.

What exactly is a Fabric ? What is a Switch Fabric?

Greg Ferro did a post here explaining how Ethernet helped the layer 2 switch fabric evolve. Sadly the use of switch fabric did not stop there. And this is the part where the confusion trickles in.

The term fabric has been butchered (mostly by marketing people) to incorporate just about any function these days. The term ‘switch fabric’ today (in the networking industry) is broadly used to describe among others the following:

  • The structure of an ASIC, e.g., the cross bar silicon fabric.
  • The hardware forwarding architecture used within layer2 bridges or switches.
  • The hardware forwarding architecture used with routers, e.g., the Cisco CRS and its 3-stage Benes switch fabric.
  • Storage topologies like the fabric-A and fabric-B SAN architecture.
  • Holistic Ethernet technologies like TRILL, Fabric-Path, Short-Path Bridging, Q-Fabric, etc.
  • A port extender device that is marketed as a fabric extender (a.k.a. FEX) namely the Cisco Nexus 2000 series.

In short, a switch fabric is basically the interconnection of points with the purpose to transport data from one point to another. These points, as evolved with time, could represent anything from an ASIC, to a port, to a device, to an entire architecture.

Cisco added a whole new dimension to this by marketing a Port Extender device as a Fabric Extender and doing so with different FEX architectures namely VM-FEX and Adapter FEX…. More on that in the next post. :)

BGP between Cisco Nexus and Fortigate

It is not uncommon to find that different vendors have slightly different implementations when it comes to standards technologies that should work seamless.

I recently came across a BGP capability negotiation problem between a Nexus 7000 and a client Fortigate. Today’s post is not teaching about any new technologies, but instead showing the troubleshooting methodology I used to find the problem.

The setup is simple. A Nexus 7000 and a Fortigate connected via nexus layer2 hosting infrastructure, to peer with BGP.
At face value the eBGP session between Nexus 7000 and the Fortigate never came up:

N7K# sh ip bgp summary | i 10.5.0.20
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.5.0.20   4 65123     190     190        0    0    0 0:12:30  Idle

The first steps should verify the obvious.

  •  Configuration! This check should included checking the ASNs, the peering IP addresses, source-interfaces and passwords matching.

Continue reading “BGP between Cisco Nexus and Fortigate”

Cisco Nexus User Roles using TacPlus

I previously wrote a post about the Nexus Roles and how they integrate with a TACACS server.

Cisco Documentation shows the following format to issue multiple roles from a TACACS/RADIUS server.:

shell:roles="network-admin vdc-admin"

We are using Shrubbery TACPLUS, instead of the Cisco ACS software. Last week I noticed that only one role was assigned when multiples should be assigned. Multiple roles are required when using one TACACS server to issue roles for VDC and non-VDC Nexus switches since they need different default User-Roles.

This was tested on a Nexus 5000, a Nexus 7000 and VDC on the same Nexus 7000. Different codes were tried. This was not a NX-OS bug.

Upon further investigation it was obvious, that the syntax above as provided by Cisco was specific their TACACS software, being the ACS software. But I still required multiple Roles to be assigned for my single TACACS configuration to work across multiple Nexus devices. First attempt was the lazy method. Ask uncle Google for any such encounters with a solution. That yielded no practical results. I then contacting Shrubbery for the solution, after that it became clear that possibly nobody else have experienced this problem before.

So the hunt began to find out exactly what was so different in the AAA response from the Cisco ACS software to the TACPLUS software that it did not yield the required results.

Continue reading “Cisco Nexus User Roles using TacPlus”

Low Memory Handling

Memory problems on routers is nothing new. It is generally less of a problem in current day, but is still seen from time to time.

BGP is capable of handling large amount of routes and in comparison to other routing protocols, BGP can be a big memory hog. BGP peering devices, especially full internet peering devices, require larger amounts of memory to store all the BGP routes. Thus it’s not uncommon to see a BGP router run out of memory when a certain route count limit is exceeded.

A router running out of memory, commonly called Low Memory, is always a bad thing. The result of low memory problems may vary from the router crashing, to routing processes being shut down or if you lucky enough erratic behavior causing route flaps and instability in your network. None which is desired.

Low memory can be caused by any of the following:

  •     Partial physical memory failure.
  •     Software memory bugs.
  •     Applications not releasing used memory chunks.
  •     Incorrect configuration.
  •     Insufficient memory allocation to a Nexus VDC.

Continue reading “Low Memory Handling”

Smart Port-Channels

Consider the following output.

How is this possible, when no AAA or Privilege Profiles are configured? Have a look at the interface configuration:

Is this a bug/feature/annoyance. Depending on the platform, this is a feature. This test-interface is part of a port-channel. This is a common operational mistake. How many times has it happened in one of your data centers, where an engineer accidentally made a change to an interface which was a member of a port-channel, only to bring the port-channel and possibly any customer data that traversed the link down?

Continue reading “Smart Port-Channels”

Cisco OTV (Part III)

This is the final follow-on post from OTV (Part I) and OTV (Part II).

In this final post I will go through the configuration steps, some outputs and FHRP isolation.

OTV Lab Setup

I setup a mini lab using two Nexus 7000 switches, each with the four VDCs, two Nexus 5000 switches and a 3750 catalyst switch.
I emulated two data center sites, each with two core switches for typical layer3 breakout, each with two switches dedicated for OTV and each with one access switch to test connectivity. Site1 includes switches 11-14 (four VDCs on N7K-1) and switch 15 (N5K), whereas Site2 includes switches 21-24 (four VDCs on N7K-2) and switch 32 (3750).

To focus on OTV, I removed the complexity from the transport network by using OTV on dedicated VDCs (four of them for redundancy), connected as inline OTV appliances and by connecting the OTV Join interfaces on a single multi-access network.

This is the topology:

Before configuring OTV, the decision must be made how OTV will be integrated part of the data center design.

Recall the OTV/SVI co-existing limitation. If core switches are in place, which are not the Nexus 7000 switches, OTV may be implemented natively on the new Nexus 7000 switch/es or using a VDCs. If the Nexus 7000 switches are providing the core switch functionality, then separate VDCs are required for OTV.

Continue reading “Cisco OTV (Part III)”

Cisco OTV (Part II)

This is a follow on post from OTV (Part I).

STP Separation

Edge Devices do take part in STP by sending and receiving BPDUs on their internal interface as would any other layer2 switch.

But an OTV Edge Device will not originate or forward BPDUs on the overlay network. OTV thus limits the STP domain to the boundaries of each site. This means a STP problem in the control plane of a given site would not produce any effect on the remote data centers. This is one of the biggest benefits of OTV in comparison to other DCI technologies. This is made possible because MAC reachability information is advertised and learned via the control plane protocol instead of learned using typical MAC flooding behavior.

With the STP separation between sites, the ability for different sites to use different STP technologies is made possible with OTV. I.e., one site can run MSTP while another runs RSTP. In the real world this is a nifty enhancement.

.

Multi-Homing

OTV allows multiple Edge Devices to co-exist in the same site for load-sharing purposes. (With NX-OS 5.1 that is limited to 2 OTV Edge Devices per site.)

With multiple OTV Edge Devices per site and no STP across the overlay to shut down redundant links, the possibility of an end-to-end site loops are created. The absence of STP between sites holds valuable benefits, but a loop prevention mechanism is still required, so an alternative method was used. The boys who wrote OTV, decided on electing a master device responsible for traffic forwarding (similar to some non-STP protocols).

With OTV this master elected device is called an AED (Authoritative Edge Device).

An AED is an Edge Device that is responsible for forwarding the extended VLAN frames in and out of a site, from and to the overlay network. It is a very important to understand this before carrying on. Only the AED will forward traffic out of the site onto the overlay. With optimal traffic replication in a transport network, a site’s broadcast and multicast traffic will reach every Edge Device in the remote site. Only the AED in the remote site will forward traffic from the overlay into the remote site. The AED thus ensures that traffic crossing the site-overlay boundary does not get duplicated or create loops when a site is multi-homed.

Continue reading “Cisco OTV (Part II)”

Cisco OTV (Part I)

OTV(Overlay Transport Virtualization) is a technology that provide layer2 extension capabilities between different data centers. In its most simplest form OTV is a new DCI (Data Center Interconnect) technology that routes MAC-based information by encapsulating traffic in normal IP packets for transit.

Cisco has submitted the IETF draft but it is not finalized yet. draft-hasmit-otv-01

OTV Overview

Traditional L2VPN technologies, like EoMPLS and VPLS, rely heavily on tunnels. Rather than creating stateful tunnels, OTV encapsulates layer2 traffic with an IP header and does not create any fixed tunnels.

OTV only requires IP connectivity between remote data center sites, which allows for the transport infrastructures to be layer2 based, layer3 based, or even label switched. IP connectivity as the base requirement along some additional connectivity requirements that will be covered in this post.

OTV requires no changes to existing data centers to work, but it is currently only supported on the Nexus 7000 series switches with M1-Series linecards.

A big enhancement OTV brings to the DCI realm, is its control plane functionality of advertising MAC reachability information instead of relying on the traditional data plane learning of MAC flooding. OTV refers to this concept as MAC routing, aka, MAC-in-IP routinig. The MAC-in-IP routing is done by encapsulating an ethernet frame in an IP packet before forwarded across the transport IP network. The action of encapsulating the traffic between the OTV devices, creates what is called an overlay between the data center sites. Think of an overlay as a logical multipoint bridged network between the sites.

OTV is deployed on devices at the edge of the data center sites, called OTV Edge Devices. These Edge Devices perform typical layer-2 learning and forwarding functions on their site facing interfaces (the Internal Interfaces) and perform IP-based virtualization functions on their core facing interface (the Join Interface) for traffic that is destined via the logical bridge interface between DC sites (the Overlay Interface).

Each Edge Device must have an IP address which is significant in the core/provider network for reachability, but is not required to have any IGP relationship with the core. This allows OTV to be inserted into any type of network in a much simpler fashion.

Lets look at some OTV terminology.

.

OTV Terminology

Continue reading “Cisco OTV (Part I)”

Playtime

Its playtime. I am fortunate enough to have the following unboxed and at my disposal for some time.


.

It is two Cisco Nexus 7010 chassis, meant for a another new 10Gb DC coming online soon.
Each comprise of the following configuration:

  • 2x SUP-1’s: First generation Supervisor.
  • 3x FAB-1: Cross connect Fabric card module.
  • 2x N7K-M132XP: M1-Series 32-Port 1/10Gb Ethernet Module, 80Gb Fabric.
  • 1x N7K-M148GS: M1-Series 48-Port 1Gb Ethernet Modules, 46Gb Fabric.

The the other switches are:

  • 2x Nexus 5010’s
  • 2x Nexus 2224TP (Fabric Extender)
  • 3x Catalyst 3750G
  • 1x Lost Catalyst 2960.

Unfortunately I do not have a F1 series linecard, it would be interesting testing Cisco FabricPath, but I can test OTV (Overlay Transport Virtualization). The messy cable configuration was done for that exact purpose, to test OTV. ;D

So in the next couple days I will cover the theory, configuration, pro’s and con’s of using OTV as a DCI (Data Center Interconnect).

RBAC with AAA Authentication

A earlier post introduced the Cisco Nexus concept of User Roles, which is a local command authorization method. There are some default system user roles.

RBAC (Role-Based Access Control) is the name/ability to create custom user roles locally on a Cisco Nexus. This gives the administrator the flexibility to define a group of certain commands to be allowed or denied for a selected role. Users can then be designated to belong to certain user roles. This designation can either be done locally on each switch or by using TACACS.

As discussed in the earlier post, AAA authorization and the user roles are mutually exclusive, since AAA Authorization overrides the permissions allowed with user roles. But using RBAC along with AAA Authentication (not Authorization), does bring some neat options to the table, depending obviously on a given network design and requirements.

How does RBAC work?

Custom user roles are defined by giving the role a name and by creating rules within the role. Each rule has a number, to decide the order in which the rules are applied. Rules are applied in descending order. I.e., rule 3 is applied before rule 2, which is applied before rule 1. This means a rule with a higher number overrides a rule with a lower number. Each role may have up to 256 rules configured. All the rules combined within a role determine what operations the role allows the associated user to perform.

Rules can be applied for the following parameters:

  • Command — A command or group of commands defined in a regular expression.
  • Feature — Commands that apply to a function provided by the Cisco Nexus switch.
  • Feature group — Default or user-defined group of features.

Continue reading “RBAC with AAA Authentication”

Cisco Nexus User Roles

IOS relies on privilege levels.  Privilege levels (0-15) defines locally what level of access a user has when logged into an IOS device, i.e. what commands are permitted. This only applies in the absence of AAA being configured. There are 3 default privilege levels on IOS, but really only two that are relevant:

  • Privilege Level 1 — Normal level on Telnet; includes all user-level commands at the router> prompt.
  • Privilege Level 15 — Includes all enable-level commands at the router# prompt.

NX-OS uses a different concept for the same purpose, known as User Roles. User Roles contain rules that define the operations allowed for a particular user assigned to a role. There are default User Roles:

  • Network-Admin—Complete read-and-write access to the entire NX-OS device (only available in the default VDC).
  • Network-Operator—Complete read access to the entire NX-OS device (Default User Role).
  • VDC-Admin—Read-and-write access limited to a VDC (VDCs are not yet available on Nexus 5000).
  • VDC-Operator—Read access limited to a VDC (Default User Role).

A VDC (Virtual Device Context) is a logical separation of control plane hardware resources into virtualized layer3 switches. Don’t worry to much about what a VDC is for now, it is not really relevant to the purpose of this post.

When a NX-OS device is setup for the first time, during the first login, a Network-Admin account must be specified and subsequently be used to login. Arguably a bit more secure that IOS. Any additional users created locally after that will by default receive the User Role “Network-Operator“, unless specified differently:

User Roles are local to a switch and only relevant in the absence of AAA Authorization being configured. To see the permissions of a particular User Role use:

N5K-2# sh role name network-operator
Role: network-operator
  Description: Predefined network operator role has access to all read
  commands on the switch
  -------------------------------------------------------------------
  Rule    Perm    Type        Scope               Entity
  -------------------------------------------------------------------
  1       permit  read

Continue reading “Cisco Nexus User Roles”

Jumbo MTU on Nexus 5000

Setting a per interface MTU (maximum transmission unit) is not supported on the Nexus 5000/2000 series switches.
If a Jumbo packet is required to traverse a Nexus 5000 series switch , the jumbo MTU must be set in a policy-map and applied to the ‘Sytem QOS’.

Configuration:

Configuration, PRE NX-OS 4.1:
policy-map JUMBO
 class class-default
  mtu 9216
system qos
 service-policy JUMBO

Configuration with POST NX-OS 4.1:
policy-map type network-qos JUMBO
 class type network-qos class-default
  mtu 9216
system qos
 service-policy type network-qos JUMBO

Continue reading “Jumbo MTU on Nexus 5000”