http://en.wikipedia.org/wiki/Machine_Check_Exception
From Wikipedia, the free encyclopedia
A Machine Check Exception (MCE) is a type of computer hardware error that occurs when a computer's central processing unit detects a hardware problem.
Microsoft Windows reports the error on the blue screen of death, with an error message of the following form (the parameters inside the parentheses vary):
STOP: 0x0000009C (0x00000004, 0x00000000, 0xB2000000, 0x00020151) "MACHINE_CHECK_EXCEPTION"
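The STOP line can be split into its parts mechanically. The following is a minimal sketch, not a definitive decoder. It assumes the parameter layout that Microsoft describes for bug check 0x9C on processors with Machine-Check Architecture (parameter 1 is the bank number, parameter 2 the address of the exception record, and parameters 3 and 4 the high and low 32 bits of that bank's status register). That layout is not stated in this article, and the name `parse_stop_9c` is hypothetical.

```python
import re

STOP_LINE = ('STOP: 0x0000009C (0x00000004, 0x00000000, 0xB2000000, 0x00020151) '
             '"MACHINE_CHECK_EXCEPTION"')


def parse_stop_9c(line):
    """Split a STOP 0x9C line into bank, record address and 64-bit status.

    Assumption: parameters 3 and 4 are the high and low halves of the
    bank's status register (layout for processors with MCA support).
    """
    match = re.search(r"STOP:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)", line)
    if match is None:
        raise ValueError("not a STOP line")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    if code != 0x9C or len(params) != 4:
        raise ValueError("not a four-parameter STOP 0x9C line")
    return {
        "bank": params[0],
        "exception_record": params[1],
        # Join the two 32-bit halves into one 64-bit status value.
        "status": (params[2] << 32) | params[3],
    }


if __name__ == "__main__":
    info = parse_stop_9c(STOP_LINE)
    print(info["bank"])                # 4
    print(f"{info['status']:#018x}")   # 0xb200000000020151
```

Under that assumption, the example above reports an error in bank 4 with the status value 0xb200000000020151.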
On Linux, the kernel writes a message to the kernel log (where a process such as klogd[1] picks it up) and/or to the console screen (usually only to the console when the error is non-recoverable and the machine crashes as a result):
CPU 0: Machine Check Exception: 0000000000000004
Bank 2: f200200000000863
Kernel panic: CPU context corrupt
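The two hexadecimal values in this log can be partly decoded by hand. The sketch below assumes the register layout given in the machine-check chapter of the Intel manual: the first value is the global status register (flags RIPV, EIPV and MCIP in bits 0 to 2), and the bank value is a per-bank status register with flag bits 63 to 57 and the architectural error code in bits 15 to 0. The flag names come from that manual, not from this article, and the function names are hypothetical. It lists the flags that are set and does not interpret the error code itself.

```python
# Bit positions assumed from the Intel machine-check register layout.
MCG_STATUS_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # miscellaneous register is valid
    58: "ADDRV",  # address register is valid
    57: "PCC",    # processor context corrupt
}


def set_flags(value, flags):
    """Return the names of the flags set in value, highest bit first."""
    return [name for bit, name in sorted(flags.items(), reverse=True)
            if (value >> bit) & 1]


def decode_bank_status(status):
    """Split a 64-bit bank status value into flags and error-code fields."""
    return {
        "flags": set_flags(status, MCI_STATUS_FLAGS),
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    print(set_flags(0x0000000000000004, MCG_STATUS_FLAGS))
    # ['MCIP']
    bank = decode_bank_status(0xF200200000000863)
    print(bank["flags"])
    # ['VAL', 'OVER', 'UC', 'EN', 'PCC']
    print(hex(bank["mca_error_code"]))
    # 0x863
```

Read this way, the PCC flag in bank 2 matches the "CPU context corrupt" panic message in the log. Interpreting the error code 0x863 itself still requires the manufacturer's tables.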
The error usually occurs due to the failure or overstressing of hardware components, in cases where the error cannot be identified more specifically with a different error message. Diagnosing the error can be difficult, although Intel Pentium processors do generate more specific codes, which can be decoded with the manufacturer's help.
Non-recoverable MCEs require a restart of the system before users can continue normal operation; they often indicate a long-term problem of a general nature.
Problem types
Most of these errors relate specifically to the Pentium processor family. Similar errors may occur on other processors and will cause similar problems.
Some of the main hardware problems that cause MCEs include:
- System bus errors (errors in communication between the processor and the motherboard).
- Memory errors, which may include parity or error-correcting code (ECC) problems. Error checking ensures that data is stored correctly in the RAM; if information is corrupted, random errors occur.
- Cache errors in the processor. The cache stores important data and code; if it is corrupted, errors often occur.
Causes
Normal causes of MCEs include overheating and incorrect hardware installation. Some specific manually induced causes include:
- Overclocking (which naturally increases heat output).
- Poorly fitted heatsinks or computer fans (the same problem can happen with excessive dust in the CPU fan).
- An overloaded internal or external power supply, which can be fixed by upgrading it.
Computer software can also cause errors of this kind (normally by corrupting the data it is reading or writing). For example:
- Software performing read or write operations on non-existent memory regions, which confuses the processor and/or the system bus.
Decoding MCEs
As noted previously, decoding MCEs can prove difficult. Normally the manufacturer (especially a processor manufacturer) will be able to provide information about specific codes. Consult Chapter 15 (Machine-Check Architecture) of the Intel 64 and IA-32 Architectures Software Developer's Manual[2], or the Microsoft KB Article on Windows Exceptions[3].
Programs to Decode MCEs
- mcat: A Windows command-line program from AMD to decode MCEs from AMD K8, Family 0x10 and 0x11 processors.
- mcelog: A Linux program by Andi Kleen to decode MCEs from x86-64 processors.
- parsemce: A Linux program by Dave Jones to decode MCEs from AMD K7 processors.
- mced: A Linux program by Tim Hockin to gather MCEs from the kernel and alert interested applications. Unlike the other programs, it is a daemon (it is always running), so it can receive MCE notifications as soon as the kernel finds them. It does not try to interpret the MCE data; it only alerts other applications.