Wireless Networking is a practical guide to planning and building low-cost telecommunications infrastructure. See the editorial for more information....



Troubleshooting

What do you do when the network breaks? If you can't access a web page or email server, and clicking the reload button doesn't fix the problem, then you'll need to be able to isolate the exact location of the problem. These tools will help you to determine just where a connection problem exists.

  • ping. Just about every operating system (including Windows, Mac OS X, and of course Linux and BSD) includes a version of the ping utility. It uses ICMP packets to attempt to contact a specified host, and tells you how long it takes to get a response.

    Knowing what to ping is just as important as knowing how to ping. If you find that you cannot connect to a particular service in your web browser (say, http://yahoo.com/), you could try to ping it:

    $ ping yahoo.com
    PING yahoo.com (66.94.234.13): 56 data bytes
    64 bytes from 66.94.234.13: icmp_seq=0 ttl=57 time=29.375 ms
    64 bytes from 66.94.234.13: icmp_seq=1 ttl=56 time=35.467 ms
    64 bytes from 66.94.234.13: icmp_seq=2 ttl=56 time=34.158 ms
    ^C
    --- yahoo.com ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max/stddev = 29.375/33.000/35.467/2.618 ms
    

    Hit control-C when you are finished collecting data. If packets take a long time to come back, there may be network congestion. If return ping packets have an unusually low ttl, you may have routing problems between your machine and the remote end. But what if the ping doesn't return any data at all? If you are pinging a name instead of an IP address, you may be running into DNS problems.

    Try pinging an IP address on the Internet. If you can't reach it, it's a good idea to see if you can ping your default router:

    $ ping 216.231.38.1
    PING 216.231.38.1 (216.231.38.1): 56 data bytes
    64 bytes from 216.231.38.1: icmp_seq=0 ttl=126 time=12.991 ms
    64 bytes from 216.231.38.1: icmp_seq=1 ttl=126 time=14.869 ms
    64 bytes from 216.231.38.1: icmp_seq=2 ttl=126 time=13.897 ms
    ^C
    --- 216.231.38.1 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max/stddev = 12.991/13.919/14.869/0.767 ms
    

    If you can't ping your default router, then chances are you won't be able to get to the Internet either. If you can't even ping other IP addresses on your local LAN, then it's time to check your connection. If you're using Ethernet, is it plugged in? If you're using wireless, are you connected to the proper wireless network, and is it in range?

    Network debugging with ping is a bit of an art, but it is useful to learn. Since you will likely find ping on just about any machine you will work on, it's a good idea to learn how to use it well.
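
    The sequence of checks described above (ping a name, then an Internet IP address, then the default router, then a LAN neighbor) can be sketched as a small decision function. This is only an illustration: the names can_ping and diagnose are made up for this example, the returned messages are suggestions rather than firm diagnoses, and can_ping assumes a Unix-style ping that accepts the -c (count) option, as found on Linux, Mac OS X, and BSD.

```python
import subprocess


def can_ping(host: str) -> bool:
    """Send one ICMP echo request and report whether a reply came back.

    Assumption: a Unix-style ping that accepts -c (count). Windows ping
    uses -n for the count instead, so this helper would need adjusting there.
    """
    result = subprocess.run(
        ["ping", "-c", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def diagnose(name_ok: bool, internet_ip_ok: bool,
             router_ok: bool, lan_ok: bool) -> str:
    """Map four ping results to the most likely place to look next.

    The checks are ordered from the farthest target to the nearest one,
    so the first success tells us how far packets are getting.
    """
    if name_ok:
        return "host reachable by name"
    if internet_ip_ok:
        # An IP address answers but the name does not: likely name resolution.
        return "suspect DNS"
    if router_ok:
        # The router answers but nothing beyond it does.
        return "suspect the path beyond the default router"
    if lan_ok:
        # Local neighbors answer but the router does not.
        return "suspect the default router"
    # Nothing answers at all: cable unplugged, wrong wireless network, out of range.
    return "check the local connection"
```

    With the addresses used in this section, the call might look like diagnose(can_ping("yahoo.com"), can_ping("66.94.234.13"), can_ping("216.231.38.1"), can_ping(neighbor)), where neighbor stands for the address of any other machine on your LAN. Keep in mind that some hosts simply do not answer pings, so a failed ping to a single name is a hint rather than proof.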

  • traceroute and mtr (www.bitwizard.nl/mtr). As with ping, traceroute is found on most operating systems (it's called tracert in some versions of Microsoft Windows). By running traceroute, you can find the location of problems between your computer and any point on the Internet:

    $ traceroute -n google.com
    traceroute to google.com (72.14.207.99), 64 hops max, 40 byte packets
    1 10.15.6.1 4.322 ms 1.763 ms 1.731 ms
    2 216.231.38.1 36.187 ms 14.648 ms 13.561 ms
    3 69.17.83.233 14.197 ms 13.256 ms 13.267 ms
    4 69.17.83.150 32.478 ms 29.545 ms 27.494 ms
    5 198.32.176.31 40.788 ms 28.160 ms 28.115 ms
    6 66.249.94.14 28.601 ms 29.913 ms 28.811 ms
    7 172.16.236.8 2328.809 ms 2528.944 ms 2428.719 ms
    8 * * *
    
    The -n switch tells traceroute not to bother resolving names in DNS, and makes the trace run more quickly. You can see that at hop seven, the round trip time shoots up to more than two seconds, while packets seem to be discarded at hop eight. This might indicate a problem at that point in the network. If this part of the network is in your control, it might be worth starting your troubleshooting effort there.
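
    The way the trace above was read (look for the first hop that is either very slow or silent) can be sketched in a few lines of Python. This is a rough illustration, not a standard tool: the function names and the 1000 ms threshold are arbitrary choices for this example, and the parser only assumes the simple "hop address time ms time ms time ms" layout shown above. A silent hop does not always mean a fault, since some routers decline to answer traceroute probes.

```python
import re

# Hop lines start with the hop number; anything else (the header) is skipped.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
# A round-trip time is a number immediately followed by " ms".
RTT_RE = re.compile(r"(\d+(?:\.\d+)?) ms")

# The hop lines from the trace shown above.
SAMPLE = """\
traceroute to google.com (72.14.207.99), 64 hops max, 40 byte packets
1 10.15.6.1 4.322 ms 1.763 ms 1.731 ms
2 216.231.38.1 36.187 ms 14.648 ms 13.561 ms
3 69.17.83.233 14.197 ms 13.256 ms 13.267 ms
4 69.17.83.150 32.478 ms 29.545 ms 27.494 ms
5 198.32.176.31 40.788 ms 28.160 ms 28.115 ms
6 66.249.94.14 28.601 ms 29.913 ms 28.811 ms
7 172.16.236.8 2328.809 ms 2528.944 ms 2428.719 ms
8 * * *
"""


def parse_hops(trace_output):
    """Return (hop_number, [rtt_ms, ...]) pairs; an empty list means no replies."""
    hops = []
    for line in trace_output.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header or blank line
        rtts = [float(value) for value in RTT_RE.findall(match.group(2))]
        hops.append((int(match.group(1)), rtts))
    return hops


def first_suspect_hop(hops, slow_ms=1000.0):
    """Return the first hop that is silent or whose average RTT exceeds slow_ms.

    slow_ms is an arbitrary threshold for this sketch; pick one that suits
    the links you are testing. Returns None if no hop looks suspicious.
    """
    for number, rtts in hops:
        if not rtts or sum(rtts) / len(rtts) > slow_ms:
            return number
    return None
```

    Calling first_suspect_hop(parse_hops(SAMPLE)) on the trace above points at hop seven, which matches the reading given in the text.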

    My TraceRoute (mtr) is a handy program that combines ping and traceroute into a single tool. By running mtr, you can get an ongoing average of latency and packet loss to a single host, instead of the momentary snapshot that ping and traceroute provide.

    My traceroute [v0.69]
    tesla.rob.swn (0.0.0.0) (tos=0x0 psize=64 bitpatSun Jan 8 20:01:26 2006
    Keys: Help Display mode Restart statistics Order of fields quit
    
                                         Packets          Pings
    Host                              Loss% Snt Last  Avg Best Wrst StDev
    gremlin.rob.swn                   0.0%    4  1.9  2.0  1.7  2.6  0.4
    er1.sea1.speakeasy.net            0.0%    4 15.5 14.0 12.7 15.5  1.3
    220.ge-0-1-0.cr2.sea1.speakeasy.  0.0%    4 11.0 11.7 10.7 14.0  1.6
    fe-0-3-0.cr2.sfo1.speakeasy.net   0.0%    4 36.0 34.7 28.7 38.1  4.1
    bas1-m.pao.yahoo.com              0.0%    4 27.9 29.6 27.9 33.0  2.4
    so-1-1-0.pat1.dce.yahoo.com       0.0%    4 89.7 91.0 89.7 93.0  1.4
    ae1.p400.msr1.dcn.yahoo.com       0.0%    4 91.2 93.1 90.8 99.2  4.1
    ge5-2.bas1-m.dcn.yahoo.com        0.0%    4 89.3 91.0 89.3 93.4  1.9
    w2.rc.vip.dcn.yahoo.com           0.0%    3 91.2 93.1 90.8 99.2  4.1
    
    The data will be continuously updated and averaged over time. As with ping, you should hit control-C when you are finished looking at the data. Note that you must have root privileges to run mtr.
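
    To make the column headings above concrete, here is a minimal sketch of how figures such as Loss%, Snt, Last, Avg, Best, Wrst, and StDev could be derived from a series of probes to one host. It is not mtr's own code. The summarize function is a made-up name, and the use of the population standard deviation is an assumption, since the text does not say which formula mtr applies.

```python
import statistics


def summarize(samples):
    """Summarize round-trip times in milliseconds for one host.

    samples is a list of floats, with None marking a probe that got no reply.
    The keys mirror the column names in the display above.
    """
    replies = [s for s in samples if s is not None]
    sent = len(samples)
    summary = {
        "loss_pct": 100.0 * (sent - len(replies)) / sent if sent else 0.0,
        "snt": sent,
    }
    if replies:
        summary.update(
            last=replies[-1],                 # most recent reply
            avg=statistics.mean(replies),
            best=min(replies),
            wrst=max(replies),
            stdev=statistics.pstdev(replies),  # assumption: population stdev
        )
    return summary
```

    For example, summarize([10.0, None, 20.0, 30.0]) reports four probes sent with 25% loss and an average of 20 ms. Feeding the function a growing list of samples gives the kind of running average that a single ping or traceroute run does not.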

    While these tools will not reveal precisely what is wrong with the network, they can give you enough information to know where to continue troubleshooting.




Last Update: 2007-01-25