summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2012-05-19Merge branch 'x86/ld-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 linker bug workarounds from Peter Anvin. GNU ld-2.22.52.0.[12] (*) has an unfortunate bug where it incorrectly turns certain relocation entries absolute. Section-relative symbols that are part of otherwise empty sections are silently changed them to absolute. We rely on section-relative symbols staying section-relative, and actually have several sections in the linker script solely for this purpose. See for example http://sourceware.org/bugzilla/show_bug.cgi?id=14052 We could just black-list the buggy linker, but it appears that it got shipped in at least F17, and possibly other distros too, so it's sadly not some rare unusual case. This backports the workaround from the x86/trampoline branch, and as Peter says: "This is not a minimal fix, not at all, but it is a tested code base." * 'x86/ld-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, relocs: When printing an error, say relative or absolute x86, relocs: Workaround for binutils 2.22.52.0.1 section bug x86, realmode: 16-bit real-mode code support for relocs tool (*) That's a manly release numbering system. Stupid, sure. But manly.
2012-05-18x86, relocs: When printing an error, say relative or absoluteH. Peter Anvin
When the relocs tool throws an error, let the error message say if it is an absolute or relative symbol. This should make it a lot more clear what action the programmer needs to take and should help us find the reason if additional symbol bugs show up. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@vger.kernel.org>
2012-05-18x86, relocs: Workaround for binutils 2.22.52.0.1 section bugH. Peter Anvin
GNU ld 2.22.52.0.1 has a bug that it blindly changes symbols from section-relative to absolute if they are in a section of zero length. This turns the symbols __init_begin and __init_end into absolute symbols. Let the relocs program know that those should be treated as relative symbols. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: <stable@vger.kernel.org> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
2012-05-18x86, realmode: 16-bit real-mode code support for relocs toolH. Peter Anvin
A new option is added to the relocs tool called '--realmode'. This option causes the generation of 16-bit segment relocations and 32-bit linear relocations for the real-mode code. When the real-mode code is moved to the low-memory during kernel initialization, these relocation entries can be used to relocate the code properly. In the assembly code 16-bit segment relocations must be relative to the 'real_mode_seg' absolute symbol. Linear relocations must be relative to a symbol prefixed with 'pa_'. 16-bit segment relocation is used to load cs:ip in 16-bit code. Linear relocations are used in the 32-bit code for relocatable data references. They are declared in the linker script of the real-mode code. The relocs tool is moved to arch/x86/tools/relocs.c, and added new target archscripts that can be used to build scripts needed building an architecture. be compiled before building the arch/x86 tree. [ hpa: accelerating this because it detects invalid absolute relocations, a serious bug in binutils 2.22.52.0.x which currently produces bad kernels. ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-2-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
2012-05-18Merge tag 'linus-mce-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull a machine check recovery fix from Tony Luck. I really don't like how the MCE code does some of the things it does, but this does seem to be an improvement. * tag 'linus-mce-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Only restart instruction after machine check recovery if it is safe
2012-05-17Merge branches 'perf-urgent-for-linus', 'x86-urgent-for-linus' and ↵Linus Torvalds
'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf, x86 and scheduler updates from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing: Do not enable function event with enable perf stat: handle ENXIO error for perf_event_open perf: Turn off compiler warnings for flex and bison generated files perf stat: Fix case where guest/host monitoring is not supported by kernel perf build-id: Fix filename size calculation * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kvm: KVM paravirt kernels don't check for CPUID being unavailable x86: Fix section annotation of acpi_map_cpu2node() x86/microcode: Ensure that module is only loaded on supported Intel CPUs * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption
2012-05-14x86/mce: Only restart instruction after machine check recovery if it is safeTony Luck
Section 15.3.1.2 of the software developer manual has this to say about the RIPV bit in the IA32_MCG_STATUS register: RIPV (restart IP valid) flag, bit 0 — Indicates (when set) that program execution can be restarted reliably at the instruction pointed to by the instruction pointer pushed on the stack when the machine-check exception is generated. When clear, the program cannot be reliably restarted at the pushed instruction pointer. We need to save the state of this bit in do_machine_check() and use it in mce_notify_process() to force a signal; even if memory_failure() says it made a complete recovery ... e.g. replaced a clean LRU page. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-05-14x86, kvm: KVM paravirt kernels don't check for CPUID being unavailableAlan Cox
We set cpuid_level to -1 if there is no CPUID instruction (only possible on i386). Signed-off-by: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/r/20120514174059.30236.1064.stgit@bluebook Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=12122 Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-09Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Avi Kivity: "Two asynchronous page fault fixes (one guest, one host), a powerpc page refcount fix, and an ia64 build fix." * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: ia64: fix build due to typo KVM: PPC: Book3S HV: Fix refcounting of hugepages KVM: Do not take reference to mm during async #PF KVM: ensure async PF event wakes up vcpu from halt
2012-05-08Merge tag 'stable/for-linus-3.4-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: - fix to Kconfig to make it fit within 80 line characters, - two bootup fixes (AMD 8-core and with PCI BIOS), - cleanup code in a Xen PV fb driver, - and a crash fix when trying to see non-existent PTE's * tag 'stable/for-linus-3.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/Kconfig: fix Kconfig layout xen/pci: don't use PCI BIOS service for configuration space accesses xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEs xen/apic: Return the APIC ID (and version) for CPU 0. drivers/video/xen-fbfront.c: add missing cleanup code
2012-05-08Merge branch 'for-3.4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull two percpu fixes from Tejun Heo: "One adds missing KERN_CONT on split printk()s and the other makes the percpu allocator avoid using PMD_SIZE as atom_size on x86_32. Using PMD_SIZE led to vmalloc area exhaustion on certain configurations (x86_32 android) and the only cost of using PAGE_SIZE instead is static percpu area not being aligned to large page mapping." * 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit percpu: use KERN_CONT in pcpu_dump_alloc_info()
2012-05-08percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bitTejun Heo
With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE or PMD_SIZE for atom_size. PMD_SIZE is used when CPU supports PSE so that percpu areas are aligned to PMD mappings and possibly allow using PMD mappings in vmalloc areas in the future. Using larger atom_size doesn't waste actual memory; however, it does require larger vmalloc space allocation later on for !first chunks. With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem but x86_32 at this point is anything but reasonable in terms of address space and using larger atom_size reportedly leads to frequent percpu allocation failures on certain setups. As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space is aplenty and most x86_64 configurations support PSE, fix the issue by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32. v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and x86_32 PAGE_SIZE as suggested by hpa. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yanmin Zhang <yanmin.zhang@intel.com> Reported-by: ShuoX Liu <shuox.liu@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4F97BA98.6010001@intel.com> Cc: stable@vger.kernel.org
2012-05-08x86: Fix section annotation of acpi_map_cpu2node()Jan Beulich
Commit 943bc7e110f2 ("x86: Fix section warnings") added __cpuinitdata here, while for functions __cpuinit should be used. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <sp@numascale.com> Link: http://lkml.kernel.org/r/4FA947910200007800082470@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Steffen Persvold <sp@numascale.com>
2012-05-07xen/pci: don't use PCI BIOS service for configuration space accessesDavid Vrabel
The accessing PCI configuration space with the PCI BIOS32 service does not work in PV guests. On systems without MMCONFIG or where the BIOS hasn't marked the MMCONFIG region as reserved in the e820 map, the BIOS service is probed (even though direct access is preferred) and this hangs. CC: stable@kernel.org Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> [v1: Fixed compile error when CONFIG_PCI is not set] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-07xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEsKonrad Rzeszutek Wilk
If I try to do "cat /sys/kernel/debug/kernel_page_tables" I end up with: BUG: unable to handle kernel paging request at ffffc7fffffff000 IP: [<ffffffff8106aa51>] ptdump_show+0x221/0x480 PGD 0 Oops: 0000 [#1] SMP CPU 0 .. snip.. RAX: 0000000000000000 RBX: ffffc00000000fff RCX: 0000000000000000 RDX: 0000800000000000 RSI: 0000000000000000 RDI: ffffc7fffffff000 which is due to the fact we are trying to access a PFN that is not accessible to us. The reason (at least in this case) was that PGD[256] is set to __HYPERVISOR_VIRT_START which was setup (by the hypervisor) to point to a read-only linear map of the MFN->PFN array. During our parsing we would get the MFN (a valid one), try to look it up in the MFN->PFN tree and find it invalid and return ~0 as PFN. Then pte_mfn_to_pfn would happilly feed that in, attach the flags and return it back to the caller. 'ptdump_show' bitshifts it and gets and invalid value that it tries to dereference. Instead of doing all of that, we detect the ~0 case and just return !_PAGE_PRESENT. This bug has been in existence .. at least until 2.6.37 (yikes!) CC: stable@kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-07xen/apic: Return the APIC ID (and version) for CPU 0.Konrad Rzeszutek Wilk
On x86_64 on AMD machines where the first APIC_ID is not zero, we get: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled) BIOS bug: APIC version is 0 for CPU 1/0x10, fixing up to 0x10 BIOS bug: APIC version mismatch, boot CPU: 0, CPU 1: version 10 which means that when the ACPI processor driver loads and tries to parse the _Pxx states it fails to do as, as it ends up calling acpi_get_cpuid which does this: for_each_possible_cpu(i) { if (cpu_physical_id(i) == apic_id) return i; } And the bootup CPU, has not been found so it fails and returns -1 for the first CPU - which then subsequently in the loop that "acpi_processor_get_info" does results in returning an error, which means that "acpi_processor_add" failing and per_cpu(processor) is never set (and is NULL). That means that when xen-acpi-processor tries to load (much much later on) and parse the P-states it gets -ENODEV from acpi_processor_register_performance() (which tries to read the per_cpu(processor)) and fails to parse the data. Reported-by-and-Tested-by: Stefan Bader <stefan.bader@canonical.com> Suggested-by: Boris Ostrovsky <boris.ostrovsky@amd.com> [v2: Bit-shift APIC ID by 24 bits] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-07x86/microcode: Ensure that module is only loaded on supported Intel CPUsSrivatsa S. Bhat
Exit early when there's no support for a particular CPU family. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Jones <davej@redhat.com> Cc: tigran@aivazian.fsnet.co.uk Cc: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/4F8BDB58.6070007@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-06IA32 emulation: Fix build problem for modular ia32 a.out supportLarry Finger
Commit ce7e5d2d19bc ("x86: fix broken TASK_SIZE for ia32_aout") breaks kernel builds when "CONFIG_IA32_AOUT=m" with ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined! make[1]: *** [__modpost] Error 1 The entry point needs to be exported. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Acked-by: Al Viro <viro@zeniv.linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-06Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes form Peter Anvin * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: intel_mid_powerbtn: mark irq as IRQF_NO_SUSPEND arch/x86/platform/geode/net5501.c: change active_low to 0 for LED driver x86, relocs: Remove an unused variable asm-generic: Use __BITS_PER_LONG in statfs.h x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it
2012-05-06x86: fix broken TASK_SIZE for ia32_aoutAl Viro
Setting TIF_IA32 in load_aout_binary() used to be enough; these days TASK_SIZE is controlled by TIF_ADDR32 and that one doesn't get set there. Switch to use of set_personality_ia32()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-06KVM: Do not take reference to mm during async #PFGleb Natapov
It turned to be totally unneeded. The reason the code was introduced is so that KVM can prefault swapped in page, but prefault can fail even if mm is pinned since page table can change anyway. KVM handles this situation correctly though and does not inject spurious page faults. Fixes: "INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected" warning while running LTP inside a KVM guest using the recent -next kernel. Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-06KVM: ensure async PF event wakes up vcpu from haltGleb Natapov
If vcpu executes hlt instruction while async PF is waiting to be delivered vcpu can block and deliver async PF only after another even wakes it up. This happens because kvm_check_async_pf_completion() will remove completion event from vcpu->async_pf.done before entering kvm_vcpu_block() and this will make kvm_arch_vcpu_runnable() return false. The solution is to make vcpu runnable when processing completion. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-04arch/x86/platform/geode/net5501.c: change active_low to 0 for LED driverBjarke Istrup Pedersen
It seems that there was an error with the active_low = 1 for the LED, since it should be set to 0 (meaning that active is high, since 0 is false, hence the confusion. The wiki article about it confuses it, since it contradicts itself, regarding what turns on the LED. I have tested 3.4-rc2 on my net5501 with this patch, and it makes the LED behave correctly, where "none" turns it off, and "default-on" turns it on, when echoed onto the trigger "file" in /sys/class/leds. Signed-off-by: Bjarke Istrup Pedersen <gurligebis@gentoo.org> Link: http://lkml.kernel.org/r/20120504210146.62186A018B@akpm.mtv.corp.google.com Cc: Philip Prindeville <philipp@redfish-solutions.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-05-03vfs: make word-at-a-time accesses handle a non-existing pageLinus Torvalds
It turns out that there are more cases than CONFIG_DEBUG_PAGEALLOC that can have holes in the kernel address space: it seems to happen easily with Xen, and it looks like the AMD gart64 code will also punch holes dynamically. Actually hitting that case is still very unlikely, so just do the access, and take an exception and fix it up for the very unlikely case of it being a page-crosser with no next page. And hey, this abstraction might even help other architectures that have other issues with unaligned word accesses than the possible missing next page. IOW, this could do the byte order magic too. Peter Anvin fixed a thinko in the shifting for the exception case. Reported-and-tested-by: Jana Saout <jana@saout.de> Cc: Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-30x86, relocs: Remove an unused variableKusanagi Kouichi
sh_symtab is set but not used. [ hpa: putting this in urgent because of the sheer harmlessness of the patch: it quiets a build warning but does not change any generated code. ] Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Link: http://lkml.kernel.org/r/20120401082932.D5E066FC03D@msa105.auone-net.jp Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
2012-04-27Merge tag 'stable/for-linus-3.4-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: "Some of these had been in existence since the 2.6.27 days, some since 3.0 - and some due to new features added in v3.4. The one that is most interesting is David's one - in the low-level assembler code we had be checking events needlessly. With his patch now we do it when the appropriate flag is set - with the added benefit that we can process events faster. Stefano's is fixing a mistake where the Linux IRQ numbers were ACK-ed instead of the Xen IRQ, resulting in missing interrupts. The other ones are bootup related that can show up on various hardware." - In the low-level assembler code we would jump to check events even if none were present. This incorrect behavior had been there since 2.6.27 days! - When using the fast-path for ACK-ing interrupts we were using the Linux IRQ numbers instead of the Xen ones (and they can differ) and missing interrupts in process. - Fix bootup crashes when ACPI hotplug CPUs were present and they would expand past the set number of CPUs we were allocated. - Deal with broken BIOSes when uploading C-states to the hypervisor. - Disable the cpuid check for MWAIT_LEAF if the ACPI PAD driver is loaded. If the ACPI PAD driver is used it will crash, so lets not export the functionality so the ACPI PAD driver won't load. * tag 'stable/for-linus-3.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: correctly check for pending events when restoring irq flags xen/acpi: Workaround broken BIOSes exporting non-existing C-states. xen/smp: Fix crash when booting with ACPI hotplug CPUs. xen: use the pirq number to check the pirq_eoi_map xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.
2012-04-27Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Use x2apic physical mode based on FADT setting x86/mrst: Quiet sparse noise about plain integer as NULL pointer x86, intel_cacheinfo: Fix error return code in amd_set_l3_disable_slot()
2012-04-27xen: correctly check for pending events when restoring irq flagsDavid Vrabel
In xen_restore_fl_direct(), xen_force_evtchn_callback() was being called even if no events were pending. This resulted in (depending on workload) about a 100 times as many xen_version hypercalls as necessary. Fix this by correcting the sense of the conditional jump. This seems to give a significant performance benefit for some workloads. There is some subtle tricksy "..since the check here is trying to check both pending and masked in a single cmpw, but I think this is correct. It will call check_events now only when the combined mask+pending word is 0x0001 (aka unmasked, pending)." (Ian) CC: stable@kernel.org Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-27x86/amd: Re-enable CPU topology extensions in case BIOS has disabled itAndreas Herrmann
BIOS will switch off the corresponding feature flag on family 15h models 10h-1fh non-desktop CPUs. The topology extension CPUID leafs are required to detect which cores belong to the same compute unit. (thread siblings mask is set accordingly and also correct information about L1i and L2 cache sharing depends on this). W/o this patch we wouldn't see which cores belong to the same compute unit and also cache sharing information for L1i and L2 would be incorrect on such systems. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-26xen/smp: Fix crash when booting with ACPI hotplug CPUs.Konrad Rzeszutek Wilk
When we boot on a machine that can hotplug CPUs and we are using 'dom0_max_vcpus=X' on the Xen hypervisor line to clip the amount of CPUs available to the initial domain, we get this: (XEN) Command line: com1=115200,8n1 dom0_mem=8G noreboot dom0_max_vcpus=8 sync_console mce_verbosity=verbose console=com1,vga loglvl=all guest_loglvl=all .. snip.. DMI: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x032.072520111118 07/25/2011 .. snip. SMP: Allowing 64 CPUs, 32 hotplug CPUs installing Xen timer for CPU 7 cpu 7 spinlock event irq 361 NMI watchdog: disabled (cpu7): hardware events not enabled Brought up 8 CPUs .. snip.. [acpi processor finds the CPUs are not initialized and starts calling arch_register_cpu, which creates /sys/devices/system/cpu/cpu8/online] CPU 8 got hotplugged CPU 9 got hotplugged CPU 10 got hotplugged .. snip.. initcall 1_acpi_battery_init_async+0x0/0x1b returned 0 after 406 usecs calling erst_init+0x0/0x2bb @ 1 [and the scheduler sticks newly started tasks on the new CPUs, but said CPUs cannot be initialized b/c the hypervisor has limited the amount of vCPUS to 8 - as per the dom0_max_vcpus=8 flag. The spinlock tries to kick the other CPU, but the structure for that is not initialized and we crash.] BUG: unable to handle kernel paging request at fffffffffffffed8 IP: [<ffffffff81035289>] xen_spin_lock+0x29/0x60 PGD 180d067 PUD 180e067 PMD 0 Oops: 0002 [#1] SMP CPU 7 Modules linked in: Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc2upstream-00001-gf5154e8 #1 Intel Corporation S2600CP/S2600CP RIP: e030:[<ffffffff81035289>] [<ffffffff81035289>] xen_spin_lock+0x29/0x60 RSP: e02b:ffff8801fb9b3a70 EFLAGS: 00010282 With this patch, we cap the amount of vCPUS that the initial domain can run, to exactly what dom0_max_vcpus=X has specified. In the future, if there is a hypercall that will allow a running domain to expand past its initial set of vCPUS, this patch should be re-evaluated. CC: stable@kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-26xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.Konrad Rzeszutek Wilk
There are exactly four users of __monitor and __mwait: - cstate.c (which allows acpi_processor_ffh_cstate_enter to be called when the cpuidle API drivers are used. However patch "cpuidle: replace xen access to x86 pm_idle and default_idle" provides a mechanism to disable the cpuidle and use safe_halt. - smpboot (which allows mwait_play_dead to be called). However safe_halt is always used so we skip that. - intel_idle (same deal as above). - acpi_pad.c. This the one that we do not want to run as we will hit the below crash. Why do we want to expose MWAIT_LEAF in the first place? We want it for the xen-acpi-processor driver - which uploads C-states to the hypervisor. If MWAIT_LEAF is set, the cstate.c sets the proper address in the C-states so that the hypervisor can benefit from using the MWAIT functionality. And that is the sole reason for using it. Without this patch, if a module performs mwait or monitor we get this: invalid opcode: 0000 [#1] SMP CPU 2 .. snip.. Pid: 5036, comm: insmod Tainted: G O 3.4.0-rc2upstream-dirty #2 Intel Corporation S2600CP/S2600CP RIP: e030:[<ffffffffa000a017>] [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check] RSP: e02b:ffff8801c298bf18 EFLAGS: 00010282 RAX: ffff8801c298a010 RBX: ffffffffa03b2000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8801c29800d8 RDI: ffff8801ff097200 RBP: ffff8801c298bf18 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: ffffffffa000a000 R14: 0000005148db7294 R15: 0000000000000003 FS: 00007fbb364f2700(0000) GS:ffff8801ff08c000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000179f038 CR3: 00000001c9469000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process insmod (pid: 5036, threadinfo ffff8801c298a000, task ffff8801c29cd7e0) Stack: ffff8801c298bf48 ffffffff81002124 ffffffffa03b2000 00000000000081fd 000000000178f010 000000000178f030 ffff8801c298bf78 ffffffff810c41e6 00007fff3fb30db9 00007fff3fb30db9 00000000000081fd 0000000000010000 Call Trace: [<ffffffff81002124>] do_one_initcall+0x124/0x170 [<ffffffff810c41e6>] sys_init_module+0xc6/0x220 [<ffffffff815b15b9>] system_call_fastpath+0x16/0x1b Code: <0f> 01 c8 31 c0 0f 01 c9 c9 c3 00 00 00 00 00 00 00 00 00 00 00 00 RIP [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check] RSP <ffff8801c298bf18> ---[ end trace 16582fc8a3d1e29a ]--- Kernel panic - not syncing: Fatal exception With this module (which is what acpi_pad.c would hit): MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>"); MODULE_DESCRIPTION("mwait_check_and_back"); MODULE_LICENSE("GPL"); MODULE_VERSION(); static int __init mwait_check_init(void) { __monitor((void *)&current_thread_info()->flags, 0, 0); __mwait(0, 0); return 0; } static void __exit mwait_check_exit(void) { } module_init(mwait_check_init); module_exit(mwait_check_exit); Reported-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-25Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from H. Peter Anvin. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x32, siginfo: Provide proper overrides for x32 siginfo_t asm-generic: Allow overriding clock_t and add attributes to siginfo_t x32: Check __ILP32__ instead of __LP64__ for x32 x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler ACPI: Convert wake_sleep_flags to a value instead of function x86, apic: APIC code touches invalid MSR on P5 class machines i387: ptrace breaks the lazy-fpu-restore logic x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id() x86, efi: Add dedicated EFI stub entry point x86/amd: Remove broken links from comment and kernel message x86, microcode: Ensure that module is only loaded on supported AMD CPUs x86, microcode: Fix sysfs warning during module unload on unsupported CPUs
2012-04-25x86/apic: Use x2apic physical mode based on FADT settingGreg Pearson
Provide systems that do not support x2apic cluster mode a mechanism to select x2apic physical mode using the FADT FORCE_APIC_PHYSICAL_DESTINATION_MODE bit. Changes from v1: (based on Suresh's comments) - removed #ifdef CONFIG_ACPI - removed #include <linux/acpi.h> Signed-off-by: Greg Pearson <greg.pearson@hp.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1335313436-32020-1-git-send-email-greg.pearson@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25x86/mrst: Quiet sparse noise about plain integer as NULL pointerH Hartley Sweeten
The second parameter to intel_scu_notifier_post is a void *, not an integer. This quiets the sparse noise: arch/x86/platform/mrst/mrst.c:808:48: warning: Using plain integer as NULL pointer arch/x86/platform/mrst/mrst.c:817:43: warning: Using plain integer as NULL pointer Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Acked-by: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/r/201204241500.53685.hartleys@visionengravers.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25Merge tag 'l3-fix-for-3.5' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent A small L3 cache index disable fix from Srivatsa Bhat which unifies the way the code checks for already disabled indices. ( Pulling it into v3.4 despite the v3.5 tag - the fix is small and we better keep the same code across kernel versions for such user facing interfaces. ) Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-23x32, siginfo: Provide proper overrides for x32 siginfo_tH. Peter Anvin
Provide the proper override macros for x32 siginfo_t. The combination of a special type here and an overall alignment constraint actually ends up with all the types being properly aligned, but the hack is needed to keep the substructures inside siginfo_t from adding padding. Note: use __attribute__((aligned())) since __aligned() is not exported to user space. [ v2: fix stray semicolon ] Reported-by: H.J. Lu <hjl.rools@gmail.com> Cc: Bruce J. Beare <bruce.j.beare@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/CAMe9rOqF6Kh6-NK7oP0Fpzkd4SBAWU%2BG53hwBbSD4iA2UzyxuA@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-23x32: Check __ILP32__ instead of __LP64__ for x32H.J. Lu
Check __LP64__ isn't a reliable way to tell if we are compiling for x32 since __LP64__ isnn't specified by x86-64 psABI. Not all x86-64 compilers define __LP64__, which was added to GCC 3.3. The updated x32 psABI: https://sites.google.com/site/x32abi/documents definse _ILP32 and __ILP32__ for x32. GCC trunk and 4.7 branch have been updated to define _ILP32 and __ILP32__ for x32. This patch replaces __LP64__ check with __ILP32__. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-04-23x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from ↵Konrad Rzeszutek Wilk
assembler With commit a2ef5c4fd44ce3922435139393b89f2cce47f576 "ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags is required when calling acpi_enter_sleep_state. The assembler code in wakeup_*.S did not do that. One solution is to call it from assembler and stick the wake_sleep_flags on the stack (for 32-bit) or in %esi (for 64-bit). hpa and rafael both suggested however to create a wrapper function to call acpi_enter_sleep_state and call said wrapper function ("acpi_enter_s3") from assembler. For 32-bit, the acpi_enter_s3 ends up looking as so: push %ebp mov %esp,%ebp sub $0x8,%esp movzbl 0xc1809314,%eax [wake_sleep_flags] movl $0x3,(%esp) mov %eax,0x4(%esp) call 0xc12d1fa0 <acpi_enter_sleep_state> leave ret And 64-bit: movzbl 0x9afde1(%rip),%esi [wake_sleep_flags] push %rbp mov $0x3,%edi mov %rsp,%rbp callq 0xffffffff812e9800 <acpi_enter_sleep_state> leaveq retq Reviewed-by: H. Peter Anvin <hpa@zytor.com> Suggested-by: H. Peter Anvin <hpa@zytor.com> [v2: Remove extra assembler operations, per hpa review] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1335150198-21899-3-git-send-email-konrad.wilk@oracle.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-23ACPI: Convert wake_sleep_flags to a value instead of functionKonrad Rzeszutek Wilk
With commit a2ef5c4fd44ce3922435139393b89f2cce47f576 "ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags is required when calling acpi_enter_sleep_state, which means that if there are functions outside the sleep.c code they can't get the wake_sleep_flags values. This converts the function in to a exported value and converts the module config operands to a function. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Lin Ming <ming.m.lin@intel.com> [v2: Parameters can be turned on/off dynamically] [v3: unsigned char -> u8] [v4: val -> kp->arg] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1335150198-21899-2-git-send-email-konrad.wilk@oracle.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-21kill mm argument of vm_munmap()Al Viro
it's always current->mm Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-20VM: add "vm_mmap()" helper functionLinus Torvalds
This continues the theme started with vm_brk() and vm_munmap(): vm_mmap() does the same thing as do_mmap(), but additionally does the required VM locking. This uninlines (and rewrites it to be clearer) do_mmap(), which sadly duplicates it in mm/mmap.c and mm/nommu.c. But that way we don't have to export our internal do_mmap_pgoff() function. Some day we hopefully don't have to export do_mmap() either, if all modular users can become the simpler vm_mmap() instead. We're actually very close to that already, with the notable exception of the (broken) use in i810, and a couple of stragglers in binfmt_elf. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-20VM: add "vm_munmap()" helper functionLinus Torvalds
Like the vm_brk() function, this is the same as "do_munmap()", except it does the VM locking for the caller. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-20VM: add "vm_brk()" helper functionLinus Torvalds
It does the same thing as "do_brk()", except it handles the VM locking too. It turns out that all external callers want that anyway, so we can make do_brk() static to just mm/mmap.c while at it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-20Revert "xen/p2m: m2p_find_override: use list_for_each_entry_safe"Konrad Rzeszutek Wilk
This reverts commit b960d6c43a63ebd2d8518b328da3816b833ee8cc. If we have another thread (very likely) touched the list, we end up hitting a problem "that the next element is wrong because we should be able to cope with that. The problem is that the next->next pointer would be set LIST_POISON1. " (Stefano's comment on the patch). Reverting for now. Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-19Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: lock slots_lock around device assignment KVM: VMX: Fix kvm_set_shared_msr() called in preemptible context KVM: unmap pages from the iommu when slots are removed KVM: PMU emulation: GLOBAL_CTRL MSR should be enabled on reset
2012-04-19x86, intel_cacheinfo: Fix error return code in amd_set_l3_disable_slot()Srivatsa S. Bhat
If the L3 disable slot is already in use, return -EEXIST instead of -EINVAL. The caller, store_cache_disable(), checks this return value to print an appropriate warning. Also, we want to signal with -EEXIST that the current index we're disabling has actually been already disabled on the node: $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_1 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu5/cache/index3/cache_disable_1 -bash: echo: write error: File exists The old code would say -bash: echo: write error: Invalid argument for disable slot 1 when playing the example above with no output in dmesg, which is clearly misleading. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120419070053.GB16645@elgon.mountain [Boris: add testing for the other index too] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-18KVM: VMX: Fix kvm_set_shared_msr() called in preemptible contextAvi Kivity
kvm_set_shared_msr() may not be called in preemptible context, but vmx_set_msr() does so: BUG: using smp_processor_id() in preemptible [00000000] code: qemu-kvm/22713 caller is kvm_set_shared_msr+0x32/0xa0 [kvm] Pid: 22713, comm: qemu-kvm Not tainted 3.4.0-rc3+ #39 Call Trace: [<ffffffff8131fa82>] debug_smp_processor_id+0xe2/0x100 [<ffffffffa0328ae2>] kvm_set_shared_msr+0x32/0xa0 [kvm] [<ffffffffa03a103b>] vmx_set_msr+0x28b/0x2d0 [kvm_intel] ... Making kvm_set_shared_msr() work in preemptible is cleaner, but it's used in the fast path. Making two variants is overkill, so this patch just disables preemption around the call. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-18Merge commit 'c104f1fa1ecf4ee0fc06e31b1f77630b2551be81' into ↵Konrad Rzeszutek Wilk
stable/for-linus-3.4 * commit 'c104f1fa1ecf4ee0fc06e31b1f77630b2551be81': (14566 commits) cpufreq: OMAP: fix build errors: depends on ARCH_OMAP2PLUS sparc64: Eliminate obsolete __handle_softirq() function sparc64: Fix bootup crash on sun4v. kconfig: delete last traces of __enabled_ from autoconf.h Revert "kconfig: fix __enabled_ macros definition for invisible and un-selected symbols" kconfig: fix IS_ENABLED to not require all options to be defined irq_domain: fix type mismatch in debugfs output format staging: android: fix mem leaks in __persistent_ram_init() staging: vt6656: Don't leak memory in drivers/staging/vt6656/ioctl.c::private_ioctl() staging: iio: hmc5843: Fix crash in probe function. panic: fix stack dump print on direct call to panic() drivers/rtc/rtc-pl031.c: enable clock on all ST variants Revert "mm: vmscan: fix misused nr_reclaimed in shrink_mem_cgroup_zone()" hugetlb: fix race condition in hugetlb_fault() drivers/rtc/rtc-twl.c: use static register while reading time drivers/rtc/rtc-s3c.c: add placeholder for driver private data drivers/rtc/rtc-s3c.c: fix compilation error MAINTAINERS: add PCDP console maintainer memcg: do not open code accesses to res_counter members drivers/rtc/rtc-efi.c: fix section mismatch warning ...
2012-04-18x86, apic: APIC code touches invalid MSR on P5 class machinesBryan O'Donoghue
Current APIC code assumes MSR_IA32_APICBASE is present for all systems. Pentium Classic P5 and friends didn't have this MSR. MSR_IA32_APICBASE was introduced as an architectural MSR by Intel @ P6. Code paths that can touch this MSR invalidly are when vendor == Intel && cpu-family == 5 and APIC bit is set in CPUID - or when you simply pass lapic on the kernel command line, on a P5. The below patch stops Linux incorrectly interfering with the MSR_IA32_APICBASE for P5 class machines. Other code paths exist that touch the MSR - however those paths are not currently reachable for a conformant P5. Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linux.intel.com> Link: http://lkml.kernel.org/r/4F8EEDD3.1080404@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
2012-04-17xen/p2m: m2p_find_override: use list_for_each_entry_safeStefano Stabellini
Use list_for_each_entry_safe and remove the spin_lock acquisition in m2p_find_override. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>