Troubleshooting
Resolutions and Tips
Section 1: Narrowing Down the Possibilities
Section 2: Problems and Resolutions
Narrowing Down the Possibilities:
Only do things that will narrow your search for the problem.
One of the biggest obstacles in troubleshooting is finding a way to narrow the list of possible causes behind the chaos. Conserve
your energy and only do things that will give categorical results: either it is the problem or it is not.
The best way to test a component is to put it into a known good environment. Do not put a known good component into a broken
environment.
Let's look at that a little closer. Suppose a user complains that she cannot read a file from a floppy disk that was working
yesterday. After eliminating a few things you figure it's the floppy drive, and you want to confirm that. If you replace it with a known
good drive and things still do not work, you know that the original floppy drive is probably good, but that's about all. If you
put the suspect drive into another working machine and it works, you know the drive is not the problem. If it fails, you know it
was (at least part of) the problem. By putting the suspect component into a known good environment you get definitive
answers. That cannot be stressed enough.
However, before we can decide to test the floppy drive, we have a few steps to take along the way.
Look at what it could be and how to test these things:
- it could be the floppy disk or the file OR it could be the computer (including all components)
- we take the suspect disk and try to read it on another machine, preferably the one that created it.
- let's say it's not the disk. Now we have a PC, possibly an I/O card, a data cable, a power cable, and a floppy drive to check. (check the user too :)
- switch one thing at a time, trying it on the other system, and returning it if it works. I would start with the data cable,
that's pretty simple. The power cable could be tested by hooking that lead to another 12 V device, preferably without adaptors.
- the I/O card (if applicable) or the floppy drive would be next. Take them to a working machine.
- if we still haven't found the bad part, we cast our eyes on the mainboard. But let's bend the rules: if there is a good I/O
card, try another slot. If the I/O is onboard, try to find an I/O card to use.
One exception to taking the suspect part to the known good environment is power supplies. If a faulty power supply fries your
mainboard, you don't want to plug it into another machine. Be careful testing power supplies. It is nice if you have a known good
junk board to use for testing.
Problems and Resolutions:
- Can't Send Email. Sending started taking about a minute. Receiving email performs normally.
- After checking with 2 workstations, monitoring the firewall with tcpdump, disabling GPG, and getting nowhere, I sent mail to my work email and then downloaded it locally over POP.
After studying the header I found the SMTP server had changed. Shaw did not need to inform anyone, as they expect users to have their email server set to "mail" and to be in the @home workgroup/domain.
Because I am on a separate LAN, I must fully qualify the email servers (SMTP & POP). Once the new SMTP server address was put into my client settings, everything was back to normal.
- Set the Correct Time Zone
- Track what files an install adds or alters
- Before installing, run find /* > /tmp/filename1
- After installing, run find /* > /tmp/filename2
- to compare the files, run diff /tmp/filename1 /tmp/filename2 and maybe add > /tmp/filename3 to save to a file.
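The before/after comparison above can be wrapped in two small helpers. This is only a sketch under stated assumptions: the function names snapshot and changed are made up for illustration, and the demo runs against a scratch directory rather than /. It uses find -xdev so that /proc and network mounts are skipped; that also skips other local partitions, so it may need to be run once per filesystem.

```shell
# snapshot DIR OUTFILE: write a sorted list of every path under DIR to OUTFILE.
# -xdev keeps find on one filesystem, so /proc and nfs mounts are not walked.
snapshot() {
    find "$1" -xdev 2>/dev/null | sort > "$2"
}

# changed BEFORE AFTER: print only the paths that differ between two lists.
# Lines starting with ">" were added, lines starting with "<" were removed.
changed() {
    diff "$1" "$2" | grep '^[<>]'
}

# Demo on a scratch directory (a hypothetical stand-in for a real install).
demo=$(mktemp -d)
before=$(mktemp)
after=$(mktemp)
snapshot "$demo" "$before"
touch "$demo/newfile"        # stands in for "run the installer"
snapshot "$demo" "$after"
changed "$before" "$after"
```

Name lists like these only reveal files that were added or removed. To also see files altered in place, find / -newer /tmp/filename1 lists everything modified after the first list was written.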
- Root can't access a user's HOME dir
- The default configuration for NFS exports enables "root_squash", which maps root on the client to an unprivileged anonymous user, so root cannot access (or can maybe only read) the NFS drive.
You can set it to either root_squash, no_root_squash, or all_squash, which maps everyone to a single UID & GID such as guest.
Check the man page for exports.
- I use nfs for most of my data plus my /home. I edited the /etc/exports file on the remote nfs host, and on my local machine I edited /etc/updatedb.conf to allow nfs entries into the locate database.
This is on RH 7.1. I can't remember seeing this file before; it used to be just args.
- Note
- be careful with no_root_squash: anyone who has root on a client system also gets root on the nfs host, within the nfs exports.
- updatedb could run forever if you include all the nfs mounts on a large network. I only do that at home.
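To make the three squash options concrete, here is a small sketch that writes sample exports lines to a scratch file for review. The paths (/home, /data, /pub) and the example.com client names are hypothetical and are not the author's actual setup.

```shell
# Write sample /etc/exports lines to a scratch file so they can be reviewed.
# All paths and client names below are hypothetical placeholders.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# root on the client is mapped to the anonymous user (the default)
/home   client1.example.com(rw,root_squash)
# root on the client stays root on this export: trusted hosts only
/data   admin.example.com(rw,no_root_squash)
# every client user is mapped to the anonymous user
/pub    *.example.com(ro,all_squash)
EOF
cat "$sample"
# After merging lines like these into /etc/exports on the nfs host,
# running "exportfs -ra" as root makes the server re-read the file.
```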
- The time is correct but Email is time stamped one hour ahead
- This happened on a machine at work and didn't make sense at first.
Here is a bit of the email explaining it to the user:
On Thursday 14 June 2001 07:02 pm, you wrote:
> Pete -- this is strange. Both my outgoing message and the return to
> me show the time as 1:22 pm! CvS
>
It looks like you may not be set for daylight saving time. The summertime setting should be -7 UTC (PDT).
I suspect your clock is set to the correct display time, but with the zone set to -8 UTC (PST).
The receiving station converts the time to local-time, which is 1 hr ahead.
Here is a clip from the expanded email header
(is this from your machine?):
Date: Thu, 14 Jun 2001 18:02:33 -0800
This shows that Learn is set to -0700 (PDT):
by learn.etc.bc.ca (8.9.3/8.9.3) with SMTP id SAA25322
for <pnesbitt@openschool.bc.ca>; Thu, 14 Jun 2001 18:04:17 -0700 (PDT)
I'm at home but an email from work to home shows -7 in all date fields.
Let me know if that's it, and I'll confirm my workstation in the morning.
Pete
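The check described in that email, comparing the UTC offsets stamped in the headers, can be sketched in a few lines. The helper name header_offset is made up for illustration, and the two header lines are the ones quoted above.

```shell
# header_offset LINE: print the last +HHMM or -HHMM offset in a header line.
header_offset() {
    printf '%s\n' "$1" | sed -n 's/.*\([+-][0-9][0-9][0-9][0-9]\).*/\1/p'
}

# Offsets taken from the two header lines quoted in the email above.
sent=$(header_offset 'Date: Thu, 14 Jun 2001 18:02:33 -0800')
relay=$(header_offset 'for <pnesbitt@openschool.bc.ca>; Thu, 14 Jun 2001 18:04:17 -0700 (PDT)')

echo "sender offset: $sent"
echo "relay offset:  $relay"
echo "this machine:  $(date +%z)"

# Machines in the same region should agree; a mismatch points at a
# time zone (or daylight saving) setting rather than at the clock itself.
[ "$sent" = "$relay" ] || echo "offsets differ: check the sender's time zone setting"
```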
- Force Auto-Detect of Hardware after you change hardware
- This came up when kudzu (the hardware detection utility) would no longer detect something after it was put back in.
- If you want to force kudzu to find something, you have to remove the file that references it and edit the control file.
Check the dir /etc/sysconfig.
- to make it detect your mouse from scratch:
- open the file "hwconf" and remove the section called "mouse"
- delete the file called "mouse" from that dir
- reboot
- DHCP keeps dropping IP Address
- This is another email about a firewall issue I had:
- Well, I think I finally found my dropped ip address problem. I also discovered a cool command, pump, as in "pump -i eth0 --status".
Anyway, after lots of scrutinizing logs and chain rules, it looked like my dhcp lease could be set but not renewed. Still unsure why that was.
- I had only allowed my machine to exchange DHCP broadcasts. That is how the conversation starts, but after that the client and server speak to each other directly.
- this is what I had:
# allow DHCP conversations. bootps=67=dhcp_server, bootpc=68=dhcp_client
# (book didn't have "no connect", added it 29-01-01)
ipchains -A input -i $EXIF -p udp -s 0/0 bootps -d 255.255.255.255 bootpc -j ACCEPT
ipchains -A input -i $EXIF -p tcp ! -y -s 0/0 bootps -d 255.255.255.255 bootpc -j ACCEPT
ipchains -A output -i $EXIF -p udp -s 0/0 bootpc -d 255.255.255.255 bootps -j ACCEPT
ipchains -A output -i $EXIF -p tcp -s 0/0 bootpc -d 255.255.255.255 bootps -j ACCEPT
- this is what I added:
# book didn't have "any" destination,
# just dest "broadcast", added -d 0/0 30-01-01
# added this to try to find why lease not renewing.
ipchains -A output -i $EXIF -p udp -s $EXIP bootpc -d 0/0 bootps -j ACCEPT
ipchains -A output -i $EXIF -p tcp -s $EXIP bootpc -d 0/0 bootps -j ACCEPT
ipchains -A input -i $EXIF -p udp -s 0/0 bootps -d $EXIP bootpc -j ACCEPT
ipchains -A input -i $EXIF -p tcp ! -y -s 0/0 bootps -d $EXIP bootpc -j ACCEPT
# end of DHCP client
- It does appear a little loose security-wise; I could be attacked through port 68, I suppose. Maybe I could restrict the lease conversation to Shaw's subnets?
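One way the restriction floated in that last point might look is sketched below. It is an untested assumption, not the author's rule set: DHCP_SERVER is a hypothetical placeholder (a documentation address), and the EXIF and EXIP values are examples standing in for the variables used above. DHCP runs over UDP, so the sketch has no tcp rules. It prints the rules instead of applying them, and it only narrows the direct renewal conversation; the broadcast rules are still needed for the initial lease.

```shell
# Example values only; EXIF and EXIP mirror the variables used in the rules above.
EXIF="eth0"
EXIP="192.0.2.10"
# Hypothetical DHCP server address; a provider subnet in addr/mask form also fits here.
DHCP_SERVER="192.0.2.1"

# Print the narrowed renewal rules so they can be reviewed before use.
dhcp_rules() {
    echo "ipchains -A output -i $EXIF -p udp -s $EXIP bootpc -d $DHCP_SERVER bootps -j ACCEPT"
    echo "ipchains -A input -i $EXIF -p udp -s $DHCP_SERVER bootps -d $EXIP bootpc -j ACCEPT"
}

dhcp_rules
```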
- If ping times out, or has a very long delay before proceeding
- Try ping -n xxx.xxx.xxx.xxx. Even when you ping an IP address, ping still tries to resolve it to a name. If ping works fine with the -n option, then the problem is related to DNS resolution.
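To put numbers on the difference, the two runs can be timed side by side. This is a minimal sketch: the helper name elapsed_seconds and the TARGET variable are made up for illustration, and nothing is pinged unless TARGET is set.

```shell
# elapsed_seconds CMD...: run CMD quietly and print how many whole seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# TARGET is a hypothetical variable; set it to the address being tested,
# for example: TARGET=192.0.2.1
if [ -n "${TARGET:-}" ]; then
    echo "with name lookups:    $(elapsed_seconds ping -c 3 "$TARGET")s"
    echo "without name lookups: $(elapsed_seconds ping -n -c 3 "$TARGET")s"
fi
```

If the first figure is much larger than the second, the delay is in name resolution rather than in the network path.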
- ping reports "Warning: time of day goes back, taking countermeasures."
- I am not familiar with the new ping command under Red Hat 7.1,
but I had to add a -U to ping (man page:"Print true user-to-user latency (the old behaviour)")
to avoid having the above error line inserted in the results. The message is distracting, confusing and can mess up scripts.
original document created by Pete Nesbitt, August 2001