Also see Support and Debugging
Also see, of course, the category for the problem. For instance, the "memory" page has some info on testing memory.
Misc:
Troubleshooting mysterious X or OS crashes:
These problems are usually caused by faulty hardware, and most of the tips below address that. Another
possibility to consider is an X server that doesn't correctly support the video hardware. X servers
are very complex and often contain a few bugs, especially for recently released hardware. Crashes or hangs
of the X server often disable the keyboard, so they appear to be a problem with the basic OS (i.e., the kernel).
Some things to investigate or try (in no particular order):
-
Remove the sound card. Not a likely suspect but easy to do and you never know ...
-
IRQ/DMA/IO-memory conflicts or flaky IO cards: Remove as many cards as possible and, if the
problem goes away, put them back one at a time, testing after each.
-
Memory: If possible, try running with part of your memory removed or replaced. Borrow some if necessary for the test.
Swap out the old chips in stages: half of them, then a quarter, then one at a time.
-
Memory: Try the "memtest" program available from Sunsite.
-
Timing problems: Find your motherboard manual and set the CPU clock to a slower value. Maybe
your CPU is a re-marked one (relabeled with a higher speed than it was rated for). BTW, you aren't overclocking the CPU, are you?
-
BIOS timing: You can try running the memory with more wait states or their modern equivalent.
-
Try heavy computation and disk IO, with and without X running, to stress the system (heat and timing).
Try running a CPU-heavy job like a kernel compilation while md5sum-ing your entire disk.
-
Bad HW: See if you can borrow another video card, like a plain S3 Virge or something.
-
Memory: Try cutting back your resolution/bpp. If that fixes the problem,
then there's a decent chance it's a bad memory chip on your video card.
-
Memory: Cart all your system memory down to the local computer store and have them test it for you.
Most computer stores will do this either for free or for a nominal fee --
depends on how good a customer you are.
-
Contacts bad: Remove and reseat your CPU, cards, power supply connector, and ribbon connectors.
-
Contacts bad: Go to your local electronics supply and get 'contact cleaner spray'. The
particular kind they carry changes about three times a week depending on what the EPA says. If
you can't find any, you might try rubbing alcohol. Once you get something, take all your cards and
connectors off your motherboard, clean them thoroughly, clean the connectors on your
motherboard thoroughly, make sure to let them dry completely, then reassemble. I've seen
this fix a LOT of weird problems (like 9 out of 10 where all else has failed).
-
Heat: Check that the power supply fan and CPU fan are running and are not clogged with dust, etc.
-
Heat: Run with the case off and a big fan blowing into the case. Running with the case off but
without the big fan can sometimes make a local heat problem worse.
-
Power Supply failure or overloading: Put voltage meter probes into the back of the PS connectors
while the computer is running (and maybe while doing a "find / >/dev/null" to get a disk working).
Verify that the voltages (+/- 5, and 12 at least) are correct to +/- 10% (?).
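The tolerance check in the power supply item is simple arithmetic. The following is a minimal sketch, assuming you type in the meter readings by hand. The function name within_tolerance is made up for illustration, and the default of 10% is only the uncertain figure quoted above, so check your power supply's own specification for the real limits.

```shell
#!/bin/sh
# within_tolerance NOMINAL MEASURED [PERCENT]
# Succeeds (exit status 0) if MEASURED is within PERCENT of NOMINAL.
# PERCENT defaults to 10, the uncertain figure quoted in the list above.
# Hypothetical helper for illustration, not a standard tool.
within_tolerance() {
    awk -v nom="$1" -v meas="$2" -v pct="${3:-10}" 'BEGIN {
        diff = meas - nom; if (diff < 0) diff = -diff   # absolute deviation
        lim = nom * pct / 100; if (lim < 0) lim = -lim  # allowed deviation
        exit !(diff <= lim)
    }'
}

# Example readings (made-up values, as read off the meter).
within_tolerance 5 4.91 && echo "+5V rail OK"
within_tolerance 12 10.5 || echo "+12V rail out of range"
```

Negative rails work the same way, for example "within_tolerance -12 -11.2". A tighter limit can be passed as the third argument.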
Kernel Panic:
From Usenet (edited):
> kernel panic: VFS: Unable to mount root fs on 3:01
> What is VFS and 3:01? Can someone guess what I did wrong to
The kernel's Virtual File System (VFS) is unable to mount /dev/hda1 at "/".
You can find out what the 3:01 refers to by using "ls -l /dev":
brw-rw---- 1 root disk 3, 1 May 5 1998 /dev/hda1
There you can see the numbers 3 and 1. The 3 is the major number
of the device, and the 1 is its minor number. It doesn't matter what
the device is called (hda1 in this case). The kernel uses those
numbers to find out what the device really is.
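To make the lookup concrete, here is a small sketch that splits the pair from the panic message into the decimal numbers that "ls -l" shows. It assumes the kernel prints both fields in hexadecimal, which matters once a number goes above 9. decode_dev is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# decode_dev MAJOR:MINOR
# Split the device pair from the panic message into decimal numbers.
# Assumption: both fields are printed in hexadecimal by the kernel.
# Hypothetical helper for illustration only.
decode_dev() {
    major=$((0x${1%%:*}))   # text before the colon
    minor=$((0x${1##*:}))   # text after the colon
    echo "major=$major minor=$minor"
}

decode_dev 3:01
```

Match the resulting pair against the "3, 1" column in the "ls -l /dev" listing to find the device name.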
Last Modified 25-Jan-1999