I will say this simply and clearly so that it cannot be misunderstood:
STANDARD PING SHOULD NOT BE USED TO PROVE THE FOLLOWING:
- ROUTING PROBLEMS
- LATENCY
- PACKET LOSS
8 REASONS WHY YOU CANNOT RELY SOLELY ON ICMP PING
- ROUTING: PING IS END TO END. Therefore standard ping, by itself, cannot be used to prove there is a routing problem.
PING reveals nothing regarding the intermediate devices. ICMP is part of IP, which means it is unaware of the switches, bridges or hubs in the path to the destination, all of which add their own delay (read 'latency'). Some PING implementations contain a record route function, but record route (ping -r) only stores 9 hops. The traceroute tool does more and is a much better tool for troubleshooting routing because it sends a small number of probes with increasing TTL values (UDP by default on Unix, ICMP echo with MS-Windows tracert) and reports each router along the path that answers.
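The 9-hop ceiling on record route comes from the IPv4 header layout, not from any particular PING implementation. Here is a minimal sketch of the arithmetic, assuming the standard IPv4 limits: a 4-bit header length field counting 32-bit words, a 20-byte fixed header, and a record route option that spends 3 bytes on type, length and pointer before the address slots. The constant and function names are illustrative only.

```python
# Why IPv4 record route tops out at 9 hops (arithmetic only, no packets sent).

MAX_HEADER_WORDS = 15      # largest value the 4-bit header length field can hold
BYTES_PER_WORD = 4         # the header length field counts 32-bit words
FIXED_HEADER_BYTES = 20    # IPv4 header without options
RR_OVERHEAD_BYTES = 3      # option type + option length + pointer
IPV4_ADDRESS_BYTES = 4     # one recorded hop


def max_record_route_hops() -> int:
    """Return how many IPv4 addresses fit in a record route option."""
    max_header_bytes = MAX_HEADER_WORDS * BYTES_PER_WORD    # 60 bytes
    option_space = max_header_bytes - FIXED_HEADER_BYTES    # 40 bytes
    address_space = option_space - RR_OVERHEAD_BYTES        # 37 bytes
    return address_space // IPV4_ADDRESS_BYTES              # whole addresses only


print(max_record_route_hops())
```

Any path longer than that simply runs out of room in the header, which is one more reason to reach for traceroute.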
- Routing, Latency, Packet Loss: Platform differences. PCs running MS-Windows, Unix, and routers all handle ICMP and PING packets differently. This difference between platforms introduces delays that do not occur with ordinary TCP or UDP data. TCP and UDP are treated much more uniformly between the above platforms. PING is an ancient tool. PING was written for an environment where all hosts are the same platform type and all hosts use a common LAN protocol. PING expects all hosts to handle ICMP identically. On your LAN and on the Internet, they do not. Because PING shows round trip results, you have no way to know which device or wire in the path is at fault for your problem.
- Routing, Latency, Packet Loss: PING does not, BY ITSELF, identify the host causing the problem. There are cases where a failed PING is a normal response. If you PING www.yahoo.com and you think you see a problem at Yahoo, you have no way to know what the cause is without having foreknowledge of how the website and the entire Internet path between you and Yahoo are constructed and configured. Always run additional tests such as traceroute (all platforms), pathping (MS-Windows only) or pathchar. A device or link in the middle of the path between you and Yahoo might be failing or over-utilized, making it appear that Yahoo is dropping packets when it is not. It's also possible the network is working perfectly and Yahoo really IS dropping all the packets. As I have repeatedly said, with PING, you have no way to know.
DON'T BOTHER YOUR ISP OR ADMINISTRATOR unless you have the results of more tests than just five or ten PING responses.
- LATENCY/LOSS: Queuing and QoS.
Routers can implement queuing strategies, forcing them to handle ICMP differently from TCP and UDP. This queuing causes them to behave in a way that is contrary to the specifications for ICMP, thereby invalidating any results PING (which is an ICMP service) might generate. Devices providing Quality of Service functions (switches, routers or servers) may also handle ICMP in a way that differs from the Internet Standards and specifications in order to optimize availability for TCP and UDP traffic. A QoS device might be programmed to drop 80% of all ICMP regardless of how much TCP or UDP traffic there is currently.
- LATENCY/LOSS: RATE LIMITS. A host may have an artificial rate limit or access-list imposed to reduce the effect of a possible future denial of service attack. This will artificially drop only the ICMP packets and leave the TCP and UDP packets untouched. TCP and UDP flows will be unaffected, i.e. 100% of the TCP and UDP packets will still get through, even though there is 100% loss seen with ICMP.
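To see how a rate limit makes PING report loss that real traffic never experiences, here is a toy simulation. It is a sketch under stated assumptions, not a model of any vendor's implementation: the device is assumed to police ICMP with a simple token bucket and to forward TCP untouched, and the traffic rates, the `TokenBucket` class and the `simulate` function are all hypothetical. Time and tokens are kept as integers so the result is exact.

```python
# Toy model of a device that rate-limits ICMP only (no real packets are sent).
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Token bucket that refills at rate_per_s tokens/second, up to burst."""
    rate_per_s: int
    burst: int

    def __post_init__(self) -> None:
        self._milli_tokens = self.burst * 1000   # start full, in 1/1000 token units
        self._last_ms = 0

    def allow(self, now_ms: int) -> bool:
        # Refill for the elapsed time (ms * tokens/s = milli-tokens), capped at burst.
        elapsed_ms = now_ms - self._last_ms
        self._last_ms = now_ms
        refilled = self._milli_tokens + elapsed_ms * self.rate_per_s
        self._milli_tokens = min(self.burst * 1000, refilled)
        # Forwarding one packet costs one whole token.
        if self._milli_tokens >= 1000:
            self._milli_tokens -= 1000
            return True
        return False


def simulate(packets, icmp_limit: TokenBucket) -> dict:
    """Return the loss fraction per protocol for (time_ms, protocol) packets."""
    sent, dropped = {}, {}
    for now_ms, proto in packets:
        sent[proto] = sent.get(proto, 0) + 1
        # Only ICMP is policed; everything else is always forwarded.
        if proto == "icmp" and not icmp_limit.allow(now_ms):
            dropped[proto] = dropped.get(proto, 0) + 1
    return {proto: dropped.get(proto, 0) / sent[proto] for proto in sent}


# Ten seconds of traffic: 10 ICMP echoes/s and 10 TCP segments/s through a
# device that forwards at most 1 ICMP packet per second.
traffic = []
for i in range(100):
    traffic.append((i * 100, "icmp"))
    traffic.append((i * 100, "tcp"))

loss = simulate(traffic, TokenBucket(rate_per_s=1, burst=1))
print(loss)   # ICMP shows heavy loss, TCP shows none
```

With these made-up numbers PING would report 90% loss on a path that is delivering every TCP segment. An access-list that denies ICMP outright is the same picture with the ICMP loss at 100%.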
- LATENCY/LOSS: BASELINE DEPENDENCIES
PING return time results have no meaning unless there is prior performance data to compare them to. Most network administrators fail to do a 24-hour baseline performance evaluation of all points on their own network before they buy bandwidth from their ISP and also fail to perform a complete second analysis after the upgrades are installed. Most simply run a ping from their desktop PC and if the round trip time to their favorite strokeoff-site is greater than the figure in the ISP's service level agreement, they go ballistic. First, if you don't know what the performance of your own network was originally, how do you know what is normal when a problem occurs? What basis would you have for determining whether your bandwidth upgrades improved network performance? Second, why were you stupid enough to run your PING from a host deep inside your own network behind who knows how many bottlenecks in your own LAN? Do an initial baseline from all points on your network and follow up with performance monitoring using a reliable tool. This gives you the ability to do trending on your own network performance and on your Internet connection's performance. Last, always run your post-installation check from the outside edge of your network, preferably from the router that serves as your Internet gateway. If your Internet router can't reach Yahoo or Google in less than the number of milliseconds listed in your ISP's service level agreement, then call them on the phone and start whining like a baby. Otherwise, suck it up. You should have read the service level agreement before you signed the contract. It's not unusual to see ISPs' service level agreements quote 125ms one-way times (250ms round trip!) from the core of their network to all U.S. domestic locations.
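Comparing against a baseline can be as simple as keeping a summary of known-good round trip times and checking new measurements against it. The following is a rough sketch, not a monitoring tool. The sample values are invented, the function names are hypothetical, and the rule used here (flag the result when the current median exceeds 1.5 times the baseline's 95th percentile) is an arbitrary assumption you would tune for your own network.

```python
# Compare fresh round-trip-time samples against a stored baseline.
import statistics


def summarize(rtts_ms):
    """Return the median and nearest-rank 95th percentile of RTT samples (ms)."""
    ordered = sorted(rtts_ms)
    # Nearest rank: ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return {"median": statistics.median(ordered), "p95": ordered[rank - 1]}


def deviates_from_baseline(baseline_ms, current_ms, tolerance=1.5):
    """True if the current median is above tolerance * the baseline p95."""
    base = summarize(baseline_ms)
    now = summarize(current_ms)
    return now["median"] > base["p95"] * tolerance


# Invented numbers: a baseline taken under normal load, then two later checks.
baseline = [20, 21, 19, 22, 20, 23, 21, 20, 22, 40]
ordinary_day = [24, 25, 23, 26, 24]
bad_day = [90, 95, 88, 92, 91]

print(summarize(baseline))
print(deviates_from_baseline(baseline, ordinary_day))
print(deviates_from_baseline(baseline, bad_day))
```

The single 40 ms spike in the baseline is the kind of momentary glitch every network has. Without the baseline you would have no way to know that a 40 ms reading is within normal behavior for this path.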
- LATENCY/LOSS: Local Network Issues
Momentary 'glitches' in performance are normal occurrences on every network. This is yet another reason for performing extensive baselining and performing extensive testing before reporting a problem. In networks running OSPF, the entire network experiences latency every time the update timer ticks down to zero and the network is flooded with a large number of OSPF updates. Good baselining and network planning will help to avoid this, but keep in mind that PING can do nothing to identify the source of the OSPF problem because the traffic is coming from all routers on the network and the pause is also caused by the routers updating their routing tables, not solely the high traffic loads. Any PING run in that situation will get totally random, unpredictable and therefore useless results. Again, PING really shouldn't be used for latency.
- LATENCY/LOSS: Bottlenecks (Politics and bad network design)
It is very common for peering links between Internet providers to be overutilized and cause a bottleneck. This is caused by the political problem of ISPs wrangling over who is the bigger ISP and who pays whom for the peering connection. When each provider's engineers incorrectly use PING to diagnose the problem, the peering link appears congested on the other guy's side. ISPs like their engineers dumb and keep them that way to prevent them from finding a less stressful (and better paying) job. Thus, with the lack of training, poor tools and even poorer support, the net result is that both ISPs point the finger at each other, leaving the poor customer in the middle. If the customer of either ISP attempts to figure out who the actual upstream Internet provider is (who is the big guy), both ISPs will carp "Sorry, non-disclosure" and immediately clam up. If the peering link is congested, there is no hardware failure and there is nothing to be repaired. All equipment is functioning normally; there simply isn't enough capacity and one of the ISPs is too stubborn to buy more bandwidth. To summarize: a bad PING result at the peering point between ISPs is useless in this case and won't get anything fixed until the politics get resolved.
SUMMARY:
You can never tell if any of the items above applies to your situation, or what their effects might be; therefore any results you might get from ICMP PING are always suspect and cannot be trusted. You cannot rely on a single run of 4 PINGs from a MS-DOS prompt as the sole and absolute proof of real-time packet loss, latency or routing issues. FURTHERMORE, if you have not done extensive baselining under NORMAL conditions, you have no basis of comparison, rendering any results you might obtain useless.
Unfortunately, PING is one of the few tools that is available on all platforms. It's the old hammer and nail problem. Since all computers come with PING, most of the less knowledgeable computer folks resort to using PING because it's the only tool they have or in most cases, know how to use.
Also, just knowing how to run a PING does not guarantee that you will understand the results it reports back. The feedback from PING is deceptively simple.
You want good performance data? Want reliable information on uptime and availability for services and devices? Get a decent network monitoring package that monitors services, utilizes SNMP and RMON and doesn't rely on PING. Make sure it includes a UDP and TCP performance and throughput tool. Use the throughput tool to determine your ACTUAL loss or latency.
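If you have nothing but a scripting language handy, you can at least time real TCP traffic instead of ICMP. The sketch below measures how long a TCP connection takes to open, which exercises the same protocol your applications use. It is an illustration only and no substitute for a proper monitoring package: it measures handshake time, not throughput, and the function names and the example host are hypothetical.

```python
# Time a TCP connection setup instead of an ICMP echo.
import socket
import time


def tcp_connect_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Return the milliseconds taken to open a TCP connection to host:port.

    Raises OSError (which includes timeouts and refusals) on failure.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def sample(host: str, port: int, count: int = 5) -> list:
    """Collect count connect times; a failed attempt is recorded as None."""
    results = []
    for _ in range(count):
        try:
            results.append(tcp_connect_ms(host, port))
        except OSError:
            results.append(None)
    return results


# Example (hypothetical target; uncomment to run against a host you control):
# print(sample("www.example.com", 443))
```

Feed the numbers into your baseline records the same way you would any other measurement. A None entry means the connection itself failed, which is a far more meaningful signal than a dropped PING.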
--InetDaemon