| author | Linus Torvalds <torvalds@linux-foundation.org> | 2009-06-13 13:14:51 -0700 | 
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2009-06-13 13:14:51 -0700 | 
| commit | a2ee2981ae2a7046b10980feae9f4ab813877106 (patch) | |
| tree | ed75db7830b9ef1342659d36d2775954ce96b79f /Documentation | |
| parent | 7603ef03a22a33d36d3c75d7c1aca1f957671ad3 (diff) | |
| parent | 0d5959723e1db3fd7323c198a50c16cecf96c7a9 (diff) | |
Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (80 commits)
  x86, mce: Add boot options for corrected errors
  x86, mce: Fix mce printing
  x86, mce: fix for mce counters
  x86, mce: support action-optional machine checks
  x86, mce: define MCE_VECTOR
  x86, mce: rename mce_notify_user to mce_notify_irq
  x86: fix panic with interrupts off (needed for MCE)
  x86, mce: export MCE severities coverage via debugfs
  x86, mce: implement new status bits
  x86, mce: print header/footer only once for multiple MCEs
  x86, mce: default to panic timeout for machine checks
  x86, mce: improve mce_get_rip
  x86, mce: make non Monarch panic message "Fatal machine check" too
  x86, mce: switch x86 machine check handler to Monarch election.
  x86, mce: implement panic synchronization
  x86, mce: implement bootstrapping for machine check wakeups
  x86, mce: check early in exception handler if panic is needed
  x86, mce: add table driven machine check grading
  x86, mce: remove TSC print heuristic
  x86, mce: log corrected errors when panicing
  ...
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/Changes | 15 |
| -rw-r--r-- | Documentation/feature-removal-schedule.txt | 10 |
| -rw-r--r-- | Documentation/x86/x86_64/boot-options.txt | 44 |
| -rw-r--r-- | Documentation/x86/x86_64/machinecheck | 8 | 
4 files changed, 69 insertions, 8 deletions
diff --git a/Documentation/Changes b/Documentation/Changes
index b95082be4d5..d21b3b5aa54 100644
--- a/Documentation/Changes
+++ b/Documentation/Changes
@@ -48,6 +48,7 @@ o  procps                 3.2.0                   # ps --version
 o  oprofile               0.9                     # oprofiled --version
 o  udev                   081                     # udevinfo -V
 o  grub                   0.93                    # grub --version
+o  mcelog		  0.6
 
 Kernel compilation
 ==================
@@ -276,6 +277,16 @@ before running exportfs or mountd.  It is recommended that all NFS
 services be protected from the internet-at-large by a firewall where
 that is possible.
 
+mcelog
+------
+
+In Linux 2.6.31+ the i386 kernel needs to run the mcelog utility
+as a regular cronjob similar to the x86-64 kernel to process and log
+machine check events when CONFIG_X86_NEW_MCE is enabled. Machine check
+events are errors reported by the CPU. Processing them is strongly encouraged.
+All x86-64 kernels since 2.6.4 require the mcelog utility to
+process machine checks.
+
 Getting updated software
 ========================
 
@@ -365,6 +376,10 @@ FUSE
 ----
 o <http://sourceforge.net/projects/fuse>
 
+mcelog
+------
+o <ftp://ftp.kernel.org/pub/linux/utils/cpu/mce/mcelog/>
+
 Networking
 **********
 
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index de491a3e231..ec9ef5d0d7b 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -437,3 +437,13 @@ Why:	Superseded by tdfxfb. I2C/DDC support used to live in a separate
 	driver but this caused driver conflicts.
 Who:	Jean Delvare <khali@linux-fr.org>
 	Krzysztof Helt <krzysztof.h1@wp.pl>
+
+----------------------------
+
+What:	CONFIG_X86_OLD_MCE
+When:	2.6.32
+Why:	Remove the old legacy 32bit machine check code. This has been
+	superseded by the newer machine check code from the 64bit port,
+	but the old version has been kept around for easier testing. Note this
+	doesn't impact the old P5 and WinChip machine check handlers.
+Who:	Andi Kleen <andi@firstfloor.org>
diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 2db5893d6c9..29a6ff8bc7d 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -5,21 +5,51 @@ only the AMD64 specific ones are listed here.
 
 Machine check
 
-   mce=off disable machine check
-   mce=bootlog Enable logging of machine checks left over from booting.
-               Disabled by default on AMD because some BIOS leave bogus ones.
-               If your BIOS doesn't do that it's a good idea to enable though
-               to make sure you log even machine check events that result
-               in a reboot. On Intel systems it is enabled by default.
+   Please see Documentation/x86/x86_64/machinecheck for sysfs runtime tunables.
+
+   mce=off
+		Disable machine check
+   mce=no_cmci
+		Disable CMCI(Corrected Machine Check Interrupt) that
+		Intel processor supports.  Usually this disablement is
+		not recommended, but it might be handy if your hardware
+		is misbehaving.
+		Note that you'll get more problems without CMCI than with
+		due to the shared banks, i.e. you might get duplicated
+		error logs.
+   mce=dont_log_ce
+		Don't make logs for corrected errors.  All events reported
+		as corrected are silently cleared by OS.
+		This option will be useful if you have no interest in any
+		of corrected errors.
+   mce=ignore_ce
+		Disable features for corrected errors, e.g. polling timer
+		and CMCI.  All events reported as corrected are not cleared
+		by OS and remained in its error banks.
+		Usually this disablement is not recommended, however if
+		there is an agent checking/clearing corrected errors
+		(e.g. BIOS or hardware monitoring applications), conflicting
+		with OS's error handling, and you cannot deactivate the agent,
+		then this option will be a help.
+   mce=bootlog
+		Enable logging of machine checks left over from booting.
+		Disabled by default on AMD because some BIOS leave bogus ones.
+		If your BIOS doesn't do that it's a good idea to enable though
+		to make sure you log even machine check events that result
+		in a reboot. On Intel systems it is enabled by default.
    mce=nobootlog
 		Disable boot machine check logging.
-   mce=tolerancelevel (number)
+   mce=tolerancelevel[,monarchtimeout] (number,number)
+		tolerance levels:
 		0: always panic on uncorrected errors, log corrected errors
 		1: panic or SIGBUS on uncorrected errors, log corrected errors
 		2: SIGBUS or log uncorrected errors, log corrected errors
 		3: never panic or SIGBUS, log all errors (for testing only)
 		Default is 1
 		Can be also set using sysfs which is preferable.
+		monarchtimeout:
+		Sets the time in us to wait for other CPUs on machine checks. 0
+		to disable.
 
    nomce (for compatibility with i386): same as mce=off
 
diff --git a/Documentation/x86/x86_64/machinecheck b/Documentation/x86/x86_64/machinecheck
index a05e58e7b15..b1fb3027328 100644
--- a/Documentation/x86/x86_64/machinecheck
+++ b/Documentation/x86/x86_64/machinecheck
@@ -41,7 +41,9 @@ check_interval
 	the polling interval.  When the poller stops finding MCEs, it
 	triggers an exponential backoff (poll less often) on the polling
 	interval. The check_interval variable is both the initial and
-	maximum polling interval.
+	maximum polling interval. 0 means no polling for corrected machine
+	check errors (but some corrected errors might be still reported
+	in other ways)
 
 tolerant
 	Tolerance level. When a machine check exception occurs for a non
@@ -67,6 +69,10 @@ trigger
 	Program to run when a machine check event is detected.
 	This is an alternative to running mcelog regularly from cron
 	and allows to detect events faster.
+monarch_timeout
+	How long to wait for the other CPUs to machine check too on a
+	exception. 0 to disable waiting for other CPUs.
+	Unit: us
 
 TBD document entries for AMD threshold interrupt configuration
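
For readers who want to check how the mce= boot options documented in this merge combine on a command line, the following is a minimal sketch in Python. It is not the kernel's parser: the MceOptions class, its field names, and parse_mce_options are hypothetical names made up for this illustration, and the handling of malformed values (raising ValueError) is an assumption. Only the option spellings, the tolerance levels 0 to 3, the default of 1, and the microsecond unit of the monarch timeout come from the documentation text in the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MceOptions:
    """Hypothetical container for illustration; these are not kernel symbols."""
    disabled: bool = False                    # mce=off or nomce
    cmci: bool = True                         # cleared by mce=no_cmci
    log_corrected: bool = True                # cleared by mce=dont_log_ce
    ignore_corrected: bool = False            # set by mce=ignore_ce
    bootlog: Optional[bool] = None            # None: keep the vendor default
    tolerant: int = 1                         # documented default level
    monarch_timeout_us: Optional[int] = None  # None: not given on the command line


def parse_mce_options(cmdline: str) -> MceOptions:
    """Collect the documented mce= settings from a kernel command line string."""
    opts = MceOptions()
    for word in cmdline.split():
        if word == "nomce":
            # Documented as equivalent to mce=off.
            opts.disabled = True
            continue
        if not word.startswith("mce="):
            continue
        value = word[len("mce="):]
        if value == "off":
            opts.disabled = True
        elif value == "no_cmci":
            opts.cmci = False
        elif value == "dont_log_ce":
            opts.log_corrected = False
        elif value == "ignore_ce":
            opts.ignore_corrected = True
        elif value == "bootlog":
            opts.bootlog = True
        elif value == "nobootlog":
            opts.bootlog = False
        else:
            # Remaining documented form: tolerancelevel[,monarchtimeout]
            level, comma, timeout = value.partition(",")
            if not level.isdigit() or int(level) > 3:
                raise ValueError(f"unrecognized mce option: {word}")
            opts.tolerant = int(level)
            if comma:
                if not timeout.isdigit():
                    raise ValueError(f"bad monarch timeout in: {word}")
                # Microseconds; 0 disables waiting for other CPUs.
                opts.monarch_timeout_us = int(timeout)
    return opts
```

For example, parse_mce_options("ro quiet mce=2,500 mce=no_cmci") yields tolerance level 2, a monarch timeout of 500 us, and CMCI disabled. As the documentation itself notes, setting the tolerance level through sysfs is preferable to using the boot option.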
