An interesting client problem in one of our multi-tenant data centers came to my attention the other day. A delay-sensitive client noticed a slight increase in latency (20 ms) at very intermittent intervals from his servers in our data center to specific off-net destinations. The increase in latency was localized to the pair of Nexus 7000s functioning as the core switch layer (CSW) and the layer 3 edge for this particular data center. Beyond that, all appeared normal on the N7K CSWs.
A TCP dump from a normal trunk interface attached to the N7Ks showed unicast traffic on the N7K-2 device when the N7K-1 device was set up to receive internet traffic inbound and forward it into the data center client VLANs. The N7Ks are set up using Cisco vPC (Virtual Port Channels).
Continue reading “Troubleshooting MAC-Flushes on NX-OS”
We are all too familiar with the devastating impact a talented layer 2 loop can have on a data center lacking sufficient controls and processes. If you are using Cisco Nexus switches in your data center, you will be happy to know that NX-OS offers an interesting new tool you should add to your loop detection list. The somewhat undocumented feature is known as (for lack of a better name) FWM-Loop Detection. FWM refers to the NX-OS Forwarding Manager. In Syslog it is seen as:
Continue reading “Detecting Layer2 Loops”
Another trivial post. The upcoming posts following this one will take a more in-depth look at the Nexus technologies.
So you do a non-ISSU NX-OS upgrade on a Nexus 5000 switch and something goes wrong. After the reload you get the following prompt:
...Loader Version pr-1.3
The switch did not successfully boot from the images it was supposed to. How do you go about restoring it?
Continue reading “N5K Stuck in Boot Mode”
Perhaps another trivial post, but if you don’t know about it, you might find it extremely useful.
Cisco NX-OS keeps an on-device log file of the exec-level configuration commands that were entered successfully. Obviously similar information can be obtained from the TACACS logs, but there is a certain benefit in having it directly on the CLI.
The command is:
#show accounting log
Continue reading “Nexus Accounting Log”
This is an interesting but trivial post. Everybody knows about the interface command “load-interval”, which changes the time period over which the interface packet-rate and throughput statistics are averaged.
I discovered an addition to this command on the Nexus the other day while poking around. NX-OS allows multiple counter intervals to be configured on the same interface. This allows different sampled intervals to be listed at the same time.
The configuration is easy:
load-interval counter 1 40
load-interval counter 2 60
load-interval counter 3 180
Continue reading “Nexus load intervals”
When upgrading a Nexus 7000 to NX-OS version 5.2 (using more than 1 VDC) or to NX-OS v6+, Cisco states that the system memory needs to be upgraded to 8 GB.
Note that I have run v5.2 with only 4 GB per SUP and 2 VDCs and it worked just fine, but I should mention that the box was not under heavy load.
See how much memory your N7K has on a SUP by using the following command:
N7K# show system resources
Load average: 1 minute: 0.47 5 minutes: 0.24 15 minutes: 0.15
Processes : 959 total, 1 running
CPU states : 3.0% user, 3.5% kernel, 93.5% idle
Memory usage: 4115776K total, 2793428K used, 1322348K free
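If you have more than a handful of SUPs to check, the "Memory usage" line is easy to read with a script. The following is a minimal Python sketch, not an official tool: it parses the sample line shown above and compares the total against a cut-off. The function names and the 7 GiB cut-off are my own assumptions for illustration (an upgraded SUP will report somewhat less than a full 8 GB, and the exact figure is not given here).

```python
import re

# Sample line copied from the "show system resources" output above.
SAMPLE = "Memory usage: 4115776K total, 2793428K used, 1322348K free"

# Assumption for illustration only: treat anything above 7 GiB as an
# upgraded (8 GB) SUP. Adjust to what your own upgraded boxes report.
ASSUMED_MIN_KB = 7 * 1024 * 1024

MEMORY_RE = re.compile(
    r"Memory usage:\s*(\d+)K total,\s*(\d+)K used,\s*(\d+)K free"
)


def parse_memory_usage(text):
    """Return total/used/free memory in KB from 'show system resources' text."""
    match = MEMORY_RE.search(text)
    if match is None:
        raise ValueError("no 'Memory usage' line found")
    total_kb, used_kb, free_kb = (int(value) for value in match.groups())
    return {"total_kb": total_kb, "used_kb": used_kb, "free_kb": free_kb}


def meets_assumed_minimum(total_kb, minimum_kb=ASSUMED_MIN_KB):
    """True if the reported total is at or above the assumed cut-off."""
    return total_kb >= minimum_kb


if __name__ == "__main__":
    mem = parse_memory_usage(SAMPLE)
    print(f"total: {mem['total_kb'] / 1024 / 1024:.2f} GiB")
    print("meets assumed minimum:", meets_assumed_minimum(mem["total_kb"]))
```

Run against the sample above, the total works out to just under 4 GiB, so this SUP would be flagged as not yet upgraded.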
The upgrade for each SUP needs the Cisco bundle upgrade package (product code: N7K-SUP1-8GBUPG=). One package contains one 4 GB module (see picture below). If you have two SUPs you would need two bundles. Notice the 8 GB sticker on the module in the red block.
I am a Mac user and I have been looking for an OmniGraffle stencil with the Cisco Nexus icons but could not find one, so I ended up making one.
I have also submitted the stencil to Graffletopia.com.
Feel free to download it from Graffletopia or Mediashare: Cisco Nexus Hardware.gstencil.zip
I previously wrote a post about the Nexus Roles and how they integrate with a TACACS server.
Cisco documentation shows the following format to issue multiple roles from a TACACS/RADIUS server:
We are using Shrubbery TACPLUS instead of the Cisco ACS software. Last week I noticed that only one role was assigned when multiple roles should have been assigned. Multiple roles are required when using one TACACS server to issue roles for VDC and non-VDC Nexus switches, since they need different default user roles.
This was tested on a Nexus 5000, a Nexus 7000 and a VDC on the same Nexus 7000. Different code versions were tried. This was not an NX-OS bug.
Upon further investigation it was obvious that the syntax above, as provided by Cisco, was specific to their own TACACS software, the ACS software. But I still required multiple roles to be assigned for my single TACACS configuration to work across multiple Nexus devices. The first attempt was the lazy method: ask uncle Google for any such encounters with a solution. That yielded no practical results. I then contacted Shrubbery for a solution, after which it became clear that possibly nobody else had experienced this problem before.
So the hunt began to find out exactly what was so different in the AAA response from the Cisco ACS software to the TACPLUS software that it did not yield the required results.
Continue reading “Cisco Nexus User Roles using TacPlus”
Memory problems on routers are nothing new. They are generally less of a problem these days, but are still seen from time to time.
BGP is capable of handling a large number of routes and, in comparison to other routing protocols, BGP can be a big memory hog. BGP peering devices, especially full internet peering devices, require larger amounts of memory to store all the BGP routes. Thus it’s not uncommon to see a BGP router run out of memory when a certain route count limit is exceeded.
A router running out of memory, commonly called Low Memory, is always a bad thing. The result of low memory problems may vary from the router crashing, to routing processes being shut down or, if you are lucky enough, erratic behavior causing route flaps and instability in your network. None of which is desirable.
Low memory can be caused by any of the following:
- Partial physical memory failure.
- Software memory bugs.
- Applications not releasing used memory chunks.
- Incorrect configuration.
- Insufficient memory allocation to a Nexus VDC.
Continue reading “Low Memory Handling”
The Cisco Nexus Series platform has some good things going. Having spent much of my time recently using them, I have come to appreciate some very neat improvements NX-OS is offering over standard IOS. For the most part driving NX-OS is very similar to IOS, but it’s been greatly improved.
One such example is the output from the most used IOS command “show ip int brief”, which on NX-OS only shows ‘IP’ (being layer 3) interfaces. To see the brief state of all types of interfaces use “sh int brief” instead.
N5K-2(config)# sh ip int brief
IP Interface Status for VRF "default"(1)
Interface     IP Address      Interface Status
Vlan19        10.1.19.6       protocol-up/link-up/admin-up
Vlan22        10.1.22.6       protocol-up/link-up/admin-up
N5K-2(config)# sh int brief
Ethernet      VLAN   Type Mode   Status  Reason                   Speed     Port
Interface                                                                   Ch #
Eth1/1        1      eth  trunk  up      none                     1000(D)   51
Eth1/2        22     eth  access up      none                     10G(D)    -
Eth1/3        1      eth  trunk  down    SFP not inserted         10G(D)    50
Eth1/4        1      eth  trunk  down    SFP not inserted         10G(D)    50
Eth1/5        1      eth  trunk  down    SFP not inserted         10G(D)    -
Eth1/6        19     eth  access down    SFP not inserted         10G(D)    -
Eth1/7        1      eth  trunk  down    Link not connected       10G(D)    5
Eth1/8        1      eth  trunk  down    Link not connected       10G(D)    5
Eth1/9        1      eth  fabric down    Administratively down    10G(D)    9
Eth1/10       1      eth  fabric down    FEX identity mismatch    10G(D)    7
Eth1/11       1      eth  fabric down    vpc peerlink is down     10G(D)    34
Eth1/12       1      eth  fabric down    SFP not inserted         10G(D)    12
Eth1/13       1      eth  fabric up      none                     10G(D)    15
Eth1/14       1      eth  fabric down    Administratively down    10G(D)    9
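Because the Reason column spells out why a port is down, the brief output is easy to summarize off-box. Here is a rough Python sketch that counts the down reasons in output like the above. It assumes the whitespace-separated layout shown here (five fixed fields, a free-text reason, then speed and port-channel), and the function names are mine, not anything NX-OS provides.

```python
from collections import Counter

# A few rows copied from the "sh int brief" output above.
SAMPLE = """\
Eth1/1 1 eth trunk up none 1000(D) 51
Eth1/3 1 eth trunk down SFP not inserted 10G(D) 50
Eth1/5 1 eth trunk down SFP not inserted 10G(D) -
Eth1/7 1 eth trunk down Link not connected 10G(D) 5
Eth1/9 1 eth fabric down Administratively down 10G(D) 9
"""


def parse_row(line):
    """Split one Ethernet row into named fields, or return None for other lines."""
    tokens = line.split()
    # Assumed layout: name, vlan, type, mode, status, reason..., speed, port-channel
    if len(tokens) < 8 or not tokens[0].startswith("Eth"):
        return None
    name, vlan, if_type, mode, status = tokens[:5]
    speed, port_channel = tokens[-2:]
    reason = " ".join(tokens[5:-2])  # the reason may contain spaces
    return {
        "name": name, "vlan": vlan, "type": if_type, "mode": mode,
        "status": status, "reason": reason, "speed": speed,
        "port_channel": port_channel,
    }


def count_down_reasons(text):
    """Count how often each reason appears on interfaces that are down."""
    reasons = Counter()
    for line in text.splitlines():
        row = parse_row(line)
        if row is not None and row["status"] == "down":
            reasons[row["reason"]] += 1
    return reasons


if __name__ == "__main__":
    for reason, count in count_down_reasons(SAMPLE).most_common():
        print(f"{count:3d}  {reason}")
```

Header lines and prompts are skipped because they do not start with "Eth". Other interface types (port-channels, VLAN interfaces) would need their own handling.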
Continue reading “Nexus’ improved CLI”
Ever configured a Nexus switch to use AAA to query a TACACS+ server? Had some trouble applying standard IOS config to NX-OS?
Possibly, if your TACACS+ server is configured to allow only PAM (Pluggable Authentication Modules) authentication for its users. When an NX-OS switch sends an AAA authentication packet, it is encapsulated using PAP encoding by default. This is in contrast to normal IOS devices, which use PAM encoding by default.
To illustrate I used the following config:
ip tacacs source-interface mgmt0
tacacs-server host 10.5.0.82 key password
aaa group server tacacs+ TAC
aaa authentication login default group TAC
aaa authorization config-commands default group TAC
aaa authorization commands default group TAC
aaa accounting default group TAC
Continue reading “Nexus defaults to PAP authentication”
In November last year, a pair of Cisco Nexus 5010 switches suddenly started rebooting randomly without user intervention. Since these boxes were the front end to a VM environment, stability was an urgent concern. But in order to stabilize the environment, the root cause of the reboots had to be isolated, and quickly.
The Cisco Nexus platform might not be as mature as many would like, but it is quickly becoming a much-needed switch in next-generation data centers. Among the things I like most about the Nexus boxes are the readily available local reporting and the intuitive system checks. Obviously there are many other features that are making the platform so popular. I’ll cover some of these in time.
Coming back to the rebooting issue. Unlike IOS devices, which lose all local logging info unless a crash dump was saved to NVRAM, the Nexus writes most of its log information to disk. Thus even after the reboot, you have all the information.
Continue reading “Troubleshooting random Nexus reboots”