Network Troubleshooting Tools Kent Reuber ITS Networking Reuber@stanford.edu April 6, 2007
Outline • What problems do you need to solve? • Tool descriptions • Q&A time • Tool descriptions are in the “Software” section of the LNA Guide: http://lnaguide/software.html
What are the problems? • Are hosts online? (ping) • How do you get to hosts? (traceroute) • What are hosts running? (nmap) • Where/when have hosts been seen? (ipm) • “The network is slow” (Netspeed, iperf) • DHCP and DNS (SUNet reports) • Wireless problems (various) • Packet sniffing (Wireshark) • Batch NetDB changes (NetDB CLI)
Ping: Are you there? • Ping sends ICMP echo requests to a host and asks for a reply. Reply time is also returned. • Some hosts are configured not to reply as a matter of security policy, so a missing reply may not mean that they’re down. • Stanford de-prioritizes pings at some of our borders, so long ping times or dropped pings do not necessarily indicate a poor connection. • Stanford maintains a special host: • “ping-me.stanford.edu” • Exempt from ping filter. • Have outside users ping “ping-me” if they claim that connections to Stanford are unavailable or slow.
Ping for Advanced Users • Can increase packet size to see duplex errors. (Unix: ping -s) • Default small (<60 byte) ping packets don’t generate enough traffic to show duplex problems. • Try using pings of 1000+ bytes. • Use nmap or similar utility for “ping sweeps” of entire networks: • “nmap -sP <network range>” (Ex. “nmap -sP 171.64.18.0/23”) • Nmap: http://insecure.org/
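To see what a ping sweep of a range such as 171.64.18.0/23 actually covers, the addresses can be enumerated with Python's standard `ipaddress` module. This is only a sketch of the address arithmetic, not of nmap itself; the function name `sweep_targets` is made up for illustration.

```python
import ipaddress


def sweep_targets(cidr):
    """Return the usable host addresses in a network range as strings.

    The network and broadcast addresses are excluded, as
    ipaddress.ip_network(...).hosts() does for IPv4 ranges of this size.
    """
    network = ipaddress.ip_network(cidr)
    return [str(host) for host in network.hosts()]


# Example range taken from the slide above.
targets = sweep_targets("171.64.18.0/23")
print(len(targets), targets[0], targets[-1])
```

A /23 spans two adjacent /24 blocks, so the sweep covers both 171.64.18.x and 171.64.19.x.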
Traceroute: How do I get there? • How traceroute works: • Source sends a series of packets with increasing time-to-live (TTL) values. (TTL is the allowed number of router hops.) Unix/Mac: UDP, Windows “tracert”: ICMP. • Each router decrements the TTL; the router where it reaches 0 discards the packet and responds with an ICMP “time exceeded” message. • Like ping, the round-trip time to each hop is reported.
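The TTL mechanism can be illustrated with a small simulation. This sketch sends no real packets: the path is a plain list of hypothetical hop names, and the reply labels are simplified. It relies on the fact that a router that sees the TTL reach 0 answers with an ICMP “time exceeded” message.

```python
def send_probe(path, ttl):
    """Walk one probe along the path; return (responder, reply_type).

    `path` lists the routers in order and ends with the destination host.
    """
    for hop in path[:-1]:      # routers in front of the destination
        ttl -= 1               # each router decrements the TTL
        if ttl == 0:
            return hop, "time exceeded"
    return path[-1], "destination reached"


def simulate_traceroute(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    results = []
    for ttl in range(1, max_ttl + 1):
        responder, reply = send_probe(path, ttl)
        results.append((ttl, responder, reply))
        if reply == "destination reached":
            break
    return results


# Hypothetical two-router path, for illustration only.
for row in simulate_traceroute(["router-a", "router-b", "dest-host"]):
    print(row)
```

Each probe reveals one more hop, which is why the output of a real traceroute grows one line per TTL value.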
Traceroute notes • Routers need not reply to traceroutes. Lack of a reply does not mean that the router is down. • Return traffic doesn’t necessarily use the same path. • This can cause problems with firewalls and packet shapers that assume they see the whole conversation. • When troubleshooting connection problems, you may want to have the destination send traceroutes to you as well.
Nmap: Scanning nets • In addition to ping scans, you can scan for open ports on hosts. • This can be useful for seeing who is running a service (intentionally or otherwise!) • My recipe for scanning for open TCP ports: • “nmap -P0 -sT <net> -p <ports> -oG - | grep open” • (-P0 skips the ping check, -sT does a TCP connect scan, -oG - writes grepable output to stdout.)
Getting nmap • Download from http://insecure.org • Unix and MacOS X usually require compiling from source. • Windows binary available.
IPM: IP <-> MAC addresses • Stanford-specific utility • How it works: • Devices broadcast ARP packets when they need to communicate locally. • Routers see these ARP packets and cache the IP-to-MAC mappings. • Information is periodically harvested and kept in a database. • Using IPM, you can track when and where an IP/MAC was first and last seen.
IPM: What’s it good for? • You can find MAC addresses which aren’t in NetDB. • Find out where a particular device has been seen. • See if multiple devices are using a single IP address.
More on IPM • Where is it: • AFS: /usr/pubsw/sbin/ipm • Note: this directory is not in your default PATH. • Using IPM: • Wildcards: “_” (single character), “%” (multiple characters) • Run “ipm -h” to see list of options.
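The wildcard semantics can be sketched by translating a pattern into a regular expression. This only illustrates the matching rules as described on the slide and is not how ipm is implemented; it assumes “%” may also match zero characters (as in SQL `LIKE`), and the helper names are made up.

```python
import re


def wildcard_to_regex(pattern):
    """Translate "_" (one character) and "%" (any run of characters)."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")            # exactly one character
        elif ch == "%":
            parts.append(".*")           # assumption: zero or more characters
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts))


def matches(pattern, value):
    """True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(matches("00:0b:db:%", "00:0b:db:12:34:56"))
print(matches("171.64.18._", "171.64.18.55"))
```

The second call is false because “_” stands for exactly one character and “55” has two.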
MAC vendor codes • MAC addresses are 48 bits (6 bytes), written xx:xx:xx:xx:xx:xx, where each “x” is a hexadecimal digit (0-9, a-f). • The first 3 bytes are the Organizationally Unique Identifier (OUI), which tells you who made the network card. • Can look this up. My favorite site: http://www.coffer.com/mac_find/ • Can tell you when NetDB records are outdated. For example, a NetDB record for a Macintosh with a MAC address beginning 00:0b:db (Dell) is clearly wrong.
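Extracting the OUI is simple string handling. The sketch below normalizes a MAC address and returns its first three bytes; the one-entry vendor table holds only the Dell prefix mentioned on the slide and is illustrative, not a real OUI database.

```python
HEX_DIGITS = "0123456789abcdef"

# Illustrative table: the single prefix named on the slide.
KNOWN_VENDORS = {"00:0b:db": "Dell"}


def oui(mac):
    """Return the first three bytes of a MAC address as xx:xx:xx."""
    digits = "".join(ch for ch in mac.lower() if ch in HEX_DIGITS)
    if len(digits) != 12:
        raise ValueError(f"expected 12 hex digits in {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def vendor(mac):
    """Look up the vendor for a MAC address, or None if it is not listed."""
    return KNOWN_VENDORS.get(oui(mac))


print(oui("00:0B:DB:12:34:56"), vendor("00:0B:DB:12:34:56"))
```

Stripping separators first means colon, dash, and dotted notations all normalize to the same prefix.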
Netspeed & Iperf: Speed testing • Often hear “the network is slow”. • Is it the client, the network or a server? • Where’s the bottleneck? • Useful tools: • Netspeed (Web based speed to campus backbone). • Iperf (command line tool for point-to-point).
Netspeed • Web based speed testing to Stanford backbone: http://netspeed.stanford.edu/ or http://iperf.stanford.edu/ • Useful for finding duplex errors (misconfigured hubs or switches) in the path.
Iperf • Command line testing tool. • Can also run speed tests against netspeed.stanford.edu and iperf.stanford.edu • Can be run in server mode for testing speed between arbitrary points (e.g., within your network) • http://dast.nlanr.net/Projects/Iperf/
How fast can you go? • DSL: 1 Mbps (asymmetric) • 802.11b wireless: 1-5 Mbps • 802.11g wireless: 1-12 Mbps • Fast Ethernet: 80+ Mbps • Gigabit: ?? Note: consider these tests as upper bounds. For gigabit especially, you may not be able to transfer real data this fast.
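To turn a measured rate into something users can relate to, compute the best-case time to move a file. A minimal sketch, assuming 1 MB means 10^6 bytes and ignoring protocol overhead; the function name is made up.

```python
def transfer_seconds(size_megabytes, rate_mbps):
    """Best-case seconds to move a file at a given rate in megabits/s."""
    bits = size_megabytes * 8_000_000      # 1 MB taken as 10**6 bytes
    return bits / (rate_mbps * 1_000_000)


# A 100 MB file at two of the rates listed on the slide.
print(transfer_seconds(100, 80))   # Fast Ethernet at 80 Mbps
print(transfer_seconds(100, 1))    # DSL at 1 Mbps
```

Real transfers will take longer than these figures, since the test rates are upper bounds.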
Troubleshooting DHCP • Many things can go wrong. Problems are rarely caused by DHCP server unavailability. • Things to check: • What IP is the host getting? • NetDB record for the host. • DHCP server logs, roaming pool utilization reports.
Understanding DHCP • Stanford has two DHCP servers: dusk and dawn. • Info from NetDB is uploaded approximately every 15 minutes. Give NetDB time to upload the data before retesting. • At Stanford, MAC address information is required for successful DHCP. • Initial DHCP is a four-step process using broadcasts; renews are different.
Leases • DHCP addresses are valid for a limited period (wired and wireless). • Normal DHCP: 2 days • Roaming DHCP: 42 minutes • Hosts will re-confirm their leases halfway through the lease period. • Clients use unicast directly to the DHCP server (clients have an address and they know who their server is). • Renew message type is used.
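The halfway rule makes renewal times easy to predict when reading logs. A small sketch using the two lease lengths from the slide; the function name and the example start time are illustrative.

```python
from datetime import datetime, timedelta

NORMAL_LEASE = timedelta(days=2)       # normal DHCP lease
ROAMING_LEASE = timedelta(minutes=42)  # roaming DHCP lease


def renewal_time(lease_start, lease_length):
    """Time at which the client tries to renew: halfway through the lease."""
    return lease_start + lease_length / 2


start = datetime(2007, 4, 6, 9, 0)
print(renewal_time(start, NORMAL_LEASE))
print(renewal_time(start, ROAMING_LEASE))
```

So a roaming client should show a renew in the logs about every 21 minutes, and a normal client about once a day.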
DHCP roaming • If the NetDB record has a “home” IP address appropriate for the network where the device is located, DHCP servers will send it. • Can have “home” IP addresses and still be able to roam to other networks. • Can have multiple “home” addresses bound to each MAC address. • If no appropriate address is entered, DHCP will look for available roaming addresses on the local network. • The number of roaming addresses is specified by the LNA and defined in the NetDB network record. • Usually there are only a handful of roaming addresses. Can easily run out of them.
What address did you get? • The address received may tell you what the problem is. • Self assigned (169.254.*.*): • NetDB record not set up properly. • No roaming address available. • Routing or DHCP server problem (less likely). • 10.x.x.x: • Used by the network self-registration system (SNSR). • Could also be used by a rogue DHCP server. • 192.168.*.*: • Probably a rogue DHCP server.
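The triage above can be written as a small lookup with the standard `ipaddress` module. The ranges and their interpretations come from the slide and are specific to Stanford's setup; the labels and the function name are made up for illustration.

```python
import ipaddress

# (network, label) pairs in the order the slide lists them.
RANGES = [
    (ipaddress.ip_network("169.254.0.0/16"), "self-assigned"),
    (ipaddress.ip_network("10.0.0.0/8"), "snsr-or-rogue"),
    (ipaddress.ip_network("192.168.0.0/16"), "probable-rogue"),
]


def diagnose(address):
    """Classify the address a client received; "other" if no range matches."""
    ip = ipaddress.ip_address(address)
    for network, label in RANGES:
        if ip in network:
            return label
    return "other"


for addr in ("169.254.12.7", "10.1.2.3", "192.168.1.100", "171.64.18.20"):
    print(addr, diagnose(addr))
```

An address labeled "other" still needs checking against the NetDB record; the lookup only flags the three telltale ranges.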
Finding rogues • Try pinging the gateway that’s being distributed. • Use “arp” command to get the MAC address of the gateway. Or use a sniffer if you have one. • Look at switch MAC tables and find the offending hosts. Shut off the port or go have a “chat”. • New Net-to-Switch configs block rogue DHCP servers!
Available DHCP reports • DHCP logs for a given host. • Type in MAC address and see the conversation. • Takes practice to read. • Roaming address utilization • How many roaming addresses were used in a day. • DHCP reports from dusk and dawn • Hourly logs show number of DHCP messages for hosts. • “No free leases” may indicate that you’re out of roaming addresses. • All reports are linked from LNA Guide software section: http://lnaguide/software.html
DNS at Stanford • Host information is entered in NetDB • Uploads to DHCP servers about every 15 minutes. • Uploads to DNS servers about every hour. • Starts at 5 minutes after the hour. • Takes about 20 minutes. Should be done by 30 minutes past the hour. • Specific info on timing is kept in the NetDB help files.
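The DNS schedule lets you estimate when a NetDB change should be visible. This sketch assumes that a change entered after an upload starts (5 past the hour) waits for the next hourly run; the slide does not state this, so check the NetDB help files for the real timing. The function name is made up.

```python
from datetime import datetime, timedelta


def dns_ready_by(change_time):
    """Estimate when a NetDB change should be live in DNS.

    Uploads start at :05 and should be done by :30 of the same hour.
    Assumption: a change made after :05 misses that hour's upload.
    """
    start = change_time.replace(minute=5, second=0, microsecond=0)
    if change_time > start:
        start += timedelta(hours=1)   # wait for the next hourly upload
    return start.replace(minute=30)


print(dns_ready_by(datetime(2007, 4, 6, 10, 3)))
print(dns_ready_by(datetime(2007, 4, 6, 10, 7)))
```

Under that assumption, a change made just after the upload starts can take almost an hour and a half to appear.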
DNS inspection tools • Standard: “host”, “nslookup”, “dig”. • Stanford whois can show you most NetDB information: • “whois -h whois.stanford.edu <query>” • Use “%” and “_” as wildcards as per ipm. • Great for people who need “read-only” access, since you don’t need a NetDB account. • For host names, you need to end query in a “.” or specify “.stanford.edu” so that whois knows you want information on a host.
Wireless problems • Wireless is slow or unavailable. • Reports can be vague. “Wireless is slow on the 2nd floor.” • Isolating the problem can speed resolution. • Exactly where is the problem occurring? • What access point is the user connecting to? • Do others in the area have the same problem?
Wireless tools • Access point association: • Mac: Internet Connect utility • PC: ?? • Access point discovery for seeing available APs and channels: NetStumbler, iStumbler • Iperf and Netspeed are useful for checking speed problems. • Often, an AP reboot will solve the problem. • AP jack (tso) information is in NetDB. • Can unplug and replug if necessary.
EtherPeek and Wireshark • Stanford has site license for EtherPeek, but it’s still expensive. • Wireshark (formerly Ethereal) is free. (Motto: “Sniff free or die!”) • X windows application for Unix/Mac. • Binary for Windows. • http://www.wireshark.org/ • Some books are available!
Advice on Sniffing • You will rarely need a sniffer, but it is invaluable when you do. • Learn to use it before you need it! • You will need to set up special “span” ports on your switches to see all traffic. • Not needed if you’re only interested in broadcasts and multicasts. • Most useful for seeing traffic entering and leaving your net.
NetDB CLI overview • Designed for power users. • Provides a subset of NetDB functionality (mostly nodes) for batch changes. New features are periodically added. • Use with caution. Try one or two hosts before doing big batches.
How to run NetDB CLI • Located in AFS space: • /usr/pubsw/sbin/netdb (note: this directory is probably not in your PATH) • Use -h option to get command syntax • Stuff you can do (to a single machine or list of machines): • Change administrators, locations. • Change IP addresses. • Delete nodes.