
Applied Infrastructure

Handling NMS Performance Data, Part 4

In the previous posts, I described how an NMS can optimize data retrieval and storage. In this post, the last on the topic, I'll discuss several additional optimizations.

Request Packing
Some devices can handle more SNMP variable requests per packet than others. It is easy for an NMS to use the lowest common request size for every device, but that is inefficient for devices that can handle more. For example, if a high-end router can handle 40 requests per packet and an old, low-end switch can handle only 10, it is wasteful for the NMS to send the high-end router only 10 requests per packet. The NMS should automatically track the number of requests that each device can handle per packet. Some NMS implementations require the administrator to set this value, but that is also inefficient: it is the NMS that really knows how many requests can be packed per packet, because it sees which devices return all the data that was requested. When the NMS doesn't receive the full set of requested data, it can do a binary search to quickly determine how much to request in a packet. The NMS also has the MIB in its favor, since the MIB provides hints about the size of the returned data. For example, the NMS can determine that a request is for an octet string with a maximum size of 512 bytes. An efficient request stream makes much more effective use of network bandwidth and request queue processing.
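As a rough illustration of the binary search described above, here is a minimal Python sketch. It is not taken from any particular NMS. The `send_request` callable is a hypothetical stand-in for whatever SNMP library the NMS uses; it is assumed to send one request for a batch of OIDs and return how many values came back. The sketch also assumes that a device that answers a request of size n in full will answer any smaller request in full.

```python
from typing import Callable, List, Sequence


def find_max_varbinds(
    send_request: Callable[[Sequence[str]], int],
    oids: Sequence[str],
) -> int:
    """Binary-search the largest request size the device answers in full.

    send_request is a hypothetical callable: it sends one SNMP request
    for the given OIDs and returns the number of values returned.
    Returns 0 if even a single-OID request is not answered in full.
    """
    low, high = 1, len(oids)
    best = 0
    while low <= high:
        mid = (low + high) // 2
        batch = oids[:mid]
        if send_request(batch) == len(batch):
            best = mid          # full answer: try a larger request
            low = mid + 1
        else:
            high = mid - 1      # partial or no answer: try a smaller one
    return best


def pack_requests(oids: Sequence[str], max_per_packet: int) -> List[List[str]]:
    """Split the OID list into requests of at most max_per_packet OIDs."""
    if max_per_packet < 1:
        raise ValueError("max_per_packet must be at least 1")
    return [
        list(oids[i:i + max_per_packet])
        for i in range(0, len(oids), max_per_packet)
    ]
```

An NMS would presumably run the search once per device, remember the result, and repeat it only when a request of the remembered size stops being answered in full. The remembered value then feeds `pack_requests` on every polling cycle.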

Collect Stats from Active Interfaces
Optimize data collection by collecting stats only from active interfaces and just checking for a state change on inactive interfaces. In networks that have a large number of switches, it is common for many of the switch ports to be inactive. In some networks, the ratio of inactive to active interfaces may be 1:3 or more. The NMS can easily identify those interfaces that are operationally down (e.g., up/down or down/down). Each interface has a "last change" timestamp and an operational state variable, either of which the NMS can efficiently check on each polling cycle. If the state of an inactive interface has not changed, there is no need to retrieve its performance stats. I prefer using the "last change" timestamp because it shows whether the interface changed state at any point since the last poll, even if the interface has since returned to its earlier state.
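The selection logic can be sketched in a few lines of Python. This is only an illustration: the `IfSnapshot` type and the interface names are made up for the example, and the two fields are assumed to come from the standard IF-MIB objects ifOperStatus and ifLastChange, which the NMS would fetch cheaply on each cycle before deciding which interfaces deserve a full stats request.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IfSnapshot:
    """Hypothetical per-interface record from the cheap state check."""
    oper_up: bool      # assumed to be derived from ifOperStatus
    last_change: int   # assumed to be the raw ifLastChange value


def interfaces_to_poll(
    previous: Dict[str, IfSnapshot],
    current: Dict[str, IfSnapshot],
) -> List[str]:
    """Return interfaces whose performance stats should be collected.

    An interface is selected if it is operationally up, or if its
    "last change" timestamp differs from the previous poll (it changed
    state in between, even if it is down again now). Interfaces seen
    for the first time are always selected.
    """
    selected = []
    for name, now in current.items():
        before = previous.get(name)
        changed = before is None or before.last_change != now.last_change
        if now.oper_up or changed:
            selected.append(name)
    return selected
```

With this approach, an inactive port costs one or two variables per cycle instead of the full set of counters, and a port that bounced between polls is still picked up.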

Rank Interfaces by Importance
Some interfaces are more important than others. Many network management systems require that the administrator identify important interfaces or rank interface importance -- a labor-intensive manual process. This is something that the NMS can help perform, with the administrator refining the resulting ranking. An interface importance ranking design might have the following rankings:

1) Core infrastructure interfaces (core device to core device)
2) Core layer interfaces to distribution layer devices
3) Core infrastructure interfaces to data center infrastructure
4) Data center infrastructure interfaces to critical servers
5) High volume interfaces (typically a server)
6) Interfaces to critical services (NTP, DNS, DHCP, NMS)
7) Distribution layer interfaces to access layer devices
8) Access layer interfaces

Critical interfaces can be identified by determining device connectivity.  CDP, LLDP, Layer 3 addressing, spanning tree tables, switch trunk interfaces, and switch forwarding tables are good sources of data that the NMS can use to automatically rank the interfaces. 

Critical interfaces should be polled more frequently than edge interfaces that connect to a lower-ranked device like a laptop. The administrator can provide hints, such as the CIDR blocks in the data centers, or subnets where critical services or critical users are connected.
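A simplified sketch of automatic ranking, in Python, might look like the following. The role names, the `ROLE_PAIR_RANK` table, and the function signature are all assumptions made for illustration; a real NMS would derive the device roles and neighbor addresses from sources such as CDP, LLDP, and forwarding tables. The rank numbers follow the list above (1 is most important), and only a few of the eight rankings are modeled.

```python
import ipaddress
from typing import Dict, Iterable, Optional, Tuple

# (local device role, neighbor role) -> rank from the list above.
# The role labels are hypothetical; only some rankings are modeled.
ROLE_PAIR_RANK: Dict[Tuple[str, str], int] = {
    ("core", "core"): 1,
    ("core", "distribution"): 2,
    ("core", "datacenter"): 3,
    ("distribution", "access"): 7,
}
DEFAULT_RANK = 8            # treat everything else as an access interface
CRITICAL_SERVICE_RANK = 6   # interfaces to critical services


def rank_interface(
    local_role: str,
    neighbor_role: str,
    neighbor_ip: Optional[str],
    critical_cidrs: Iterable[str],
) -> int:
    """Rank one interface from topology data plus administrator hints."""
    rank = ROLE_PAIR_RANK.get((local_role, neighbor_role), DEFAULT_RANK)
    if neighbor_ip is not None:
        addr = ipaddress.ip_address(neighbor_ip)
        # Administrator hint: neighbors inside these blocks are critical.
        if any(addr in ipaddress.ip_network(c) for c in critical_cidrs):
            rank = min(rank, CRITICAL_SERVICE_RANK)
    return rank
```

For example, an access port whose neighbor sits in an administrator-supplied block such as 10.1.5.0/24 would be promoted from rank 8 to rank 6, while a core-to-core link keeps rank 1 regardless of the hints.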

Use Variable Polling Periods
The vast majority of interfaces in a network are typically near the bottom of the ranking list and can be polled much less frequently than the higher-ranked interfaces. Polling a low-utilization interface at a low frequency (e.g., every 15 or 20 minutes) will still provide good visibility into network problems. A high-ranking interface may need its stats collected every few minutes, while an edge port to a user workstation may be polled only a few times every hour. Frequent polling of high-ranking interfaces provides better real-time visibility into network problems and bursty traffic loads. The error stats collected from the low-ranking interfaces still provide good visibility into whether they are experiencing problems (e.g., duplex mismatch).

A variable interface polling frequency allows the NMS to efficiently handle more interfaces than if all interfaces were polled at the same frequency.  It also allows more important interfaces to be monitored more closely.  Of course, the administrator needs to be able to change the polling frequency on any interface, and some interfaces may need to be polled as often as every few seconds when collecting troubleshooting information.
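To make the idea concrete, here is a small Python sketch of a rank-driven poll schedule. The interval values and rank thresholds are illustrative assumptions, not recommendations, and the function names are invented for the example. The `overrides` argument stands in for the administrator's ability to set the polling period on any individual interface.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def interval_for_rank(rank: int) -> int:
    """Polling interval in seconds for a rank (illustrative values)."""
    if rank <= 4:
        return 120    # every 2 minutes for the most important links
    if rank <= 6:
        return 300    # every 5 minutes
    return 900        # every 15 minutes for distribution and access ports


def build_schedule(
    ranks: Dict[str, int],
    horizon: int,
    overrides: Optional[Dict[str, int]] = None,
) -> List[Tuple[int, str]]:
    """Return (time, interface) poll events for 0 <= time < horizon."""
    overrides = overrides or {}
    # Min-heap keyed on due time; every interface is first polled at t=0.
    queue = [(0, name) for name in sorted(ranks)]
    heapq.heapify(queue)
    events = []
    while queue and queue[0][0] < horizon:
        due, name = heapq.heappop(queue)
        events.append((due, name))
        interval = overrides.get(name, interval_for_rank(ranks[name]))
        heapq.heappush(queue, (due + interval, name))
    return events
```

Over a 15-minute window, a rank 1 interface in this sketch is polled eight times and a rank 8 port once. Passing an override such as `{"edge-port": 5}` would temporarily poll that port every five seconds while troubleshooting.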

  -Terry


About tslattery

Terry Slattery, CCIE #1026, is a senior network engineer with decades of experience in the internetworking industry. Prior to joining Chesapeake NetCraftsmen as a full-time consultant, Terry was the founder and CTO of Netcordia, and inventor of NetMRI, a suite of network management products. Terry started Netcordia as a consulting company in 2000 and transitioned it to a network management product company in 2003. During the consulting days, he used his network design and implementation skills to lead a team in the design and implementation of a high-availability network at a brokerage clearing house. Terry is the former President and founder of Chesapeake Computer Consultants, Inc., a networking and computer systems training and consulting company. He co-invented and patented the vLab(tm) internet-based remote lab system. He is co-author of the McGraw Hill text Advanced IP Routing in Cisco Networks. Terry led the team that developed the current Cisco IOS user interface under contract to Cisco Systems. Terry is experienced in the design and installation of large TCP/IP based networks and is a successful network protocol instructor. He is the second Cisco Certified Internetworking Expert (CCIE #1026) and the first outside of Cisco. He enjoys membership on the Vanderbilt University Engineering School’s Industrial Advisory Board and the IEEE.
