linux.git - Linux Kernel

Age	Commit message (Collapse)	Author
2016-11-12	powerpc: Fix exception vector build with 2.23 era binutils	Hugh Dickins
	The changes to use gas sections for constructing the exception vectors causes a build break when using binutils 2.23: arch/powerpc/kernel/exceptions-64s.S:770: Error: operand out of range (0xffffffffffff8100 is not between 0x0000000000000000 and 0x000000000000ffff) And so on. Reported by Hugh with binutils-2.23.2-8.1.4.ppc64 from openSUSE 13.1 and also Naveen & Denis using 2.23.52.0.1-26.el7 from RHEL 7. Strangely binutils 2.22 (what I test with) is not affected. This is caused by the use of @l in LOAD_HANDLER(). The @l was only recently added in commit a24553dd02dc ("powerpc/pseries: Remove unnecessary syscall trampoline"). Luckily the gas section changes split out the LOAD_SYSCALL_HANDLER() macro, which means we actually don't need to use @l in LOAD_HANDLER() any more, only in LOAD_SYSCALL_HANDLER(). So drop the @l from LOAD_HANDLER(). Fixes: 57f266497d81 ("powerpc: Use gas sections for arranging exception vectors") Signed-off-by: Hugh Dickins <hughd@google.com> [mpe: Add gory details to change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-12	powerpc/64s: Fix system reset interrupt winkle wakeups	Nicholas Piggin
	Wakeups from winkle set the low bit of the HSPRG0 register, to distinguish it from other sleep states. This is also the PACA pointer. The system reset exception handler fails to mask this bit away before using this value before using it as the PACA pointer. Fix this by adding a new type of exception prolog macro where we already have the PACA set in r13, and have the system reset vector mask it out. The winkle wakeup handler will store the masked value back into HSPRG0. Fixes: fb479e44a9e2 ("powerpc/64s: relocation, register save fixes for system reset interrupt") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-27	powerpc/64s: relocation, register save fixes for system reset interrupt	Nicholas Piggin
	This patch does a couple of things. First of all, powernv immediately explodes when running a relocated kernel, because the system reset exception for handling sleeps does not do correct relocated branches. Secondly, the sleep handling code trashes the condition and cfar registers, which we would like to preserve for debugging purposes (for non-sleep case exception). This patch changes the exception to use the standard format that saves registers before any tests or branches are made. It adds the test for idle-wakeup as an "extra" to break out of the normal exception path. Then it branches to a relocated idle handler that calls the various idle handling functions. After this patch, POWER8 CPU simulator now boots powernv kernel that is running at non-zero. Fixes: 948cf67c4726 ("powerpc: Add NAP mode support on Power7 in HV mode") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Use gas sections for arranging exception vectors	Nicholas Piggin
	Use assembler sections of fixed size and location to arrange the 64-bit Book3S exception vector code (64-bit Book3E also uses it in head_64.S for 0x0..0x100). This allows better flexibility in arranging exception code and hiding unimportant details behind macros. Gas sections can be a bit painful to use this way, mainly because the assembler does not know where they will be finally linked. Taking absolute addresses requires a bit of trickery for example, but it can be hidden behind macros for the most part. Generated code is mostly the same except locations, offsets, alignments. The "+ 0x2" is only required for the trap number / kvm exit number, which gets loaded as a constant into a register. Previously, code also used + 0x2 for label names, but we changed to using "H" to distinguish HV case for that. Remove the last vestiges of that. __after_prom_start is taking absolute address of a label in another fixed section. Newer toolchains seemed to compile this okay, but older ones do not. FIXED_SYMBOL_ABS_ADDR is more foolproof, it just takes an additional line to define. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Add new exception vector macros	Michael Ellerman
	Create arch/powerpc/include/asm/head-64.h with macros that specify an exception vector (name, type, location), which will be used to label and lay out exceptions into the object file. Naming is moved out of exception-64s.h, which is used to specify the implementation of exception handlers. objdump of generated code in exception vectors is unchanged except for names. Alignment directives scattered around are annoying, but done this way so that disassembly can verify identical instruction generation before and after patch. These get cleaned up in future patch. We change the way KVMTEST works, explicitly passing EXC_HV or EXC_STD rather than overloading the trap number. This removes the need to have SOFTEN values for the overloaded trap numbers, eg. 0x502. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23	powerpc/pseries: Remove unnecessary syscall trampoline	Nicholas Piggin
	When we originally added the ability to split the exception vectors from the kernel (commit 1f6a93e4c35e ("powerpc: Make it possible to move the interrupt handlers away from the kernel" 2008-09-15)), the LOAD_HANDLER() macro used an addi instruction to compute the offset of the common handler from the kernel base address. Using addi meant the handler had to be within 32K of the kernel base address, due to the addi instruction taking a signed immediate value. That necessitated creating a trampoline for the system call handler, because system_call_common (in entry64.S) is not linked within 32K of the kernel base address. Later in commit 61e2390ede3c ("powerpc: Make load_hander handle upto 64k offset" 2012-11-15) we changed LOAD_HANDLER to take a 64K offset, by changing it to use ori. Although system_call_common is not in head_64.S or exceptions-64s.S, it is included in head-y, which causes it to be linked early in the kernel text, so in practice it ends up below 64K. Additionally if it can't be placed below 64K the linker will fail to build with a "relocation truncated to fit" error. So remove the trampoline. Newer toolchains are able to work out that the ori in LOAD_HANDLER only takes a 16 bit offset, and so they generate a 16 bit relocation. Older toolchains (binutils 2.22 at least) are not so smart, so we have to add the @l annotation to tell the assembler to generate a 16 bit relocation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13	powerpc/64: Do load of PACAKBASE in LOAD_HANDLER	Michael Ellerman
	The LOAD_HANDLER macro requires that you have previously loaded "reg" with PACAKBASE. Although that gives callers flexibility to get PACAKBASE in some interesting way, none of the callers actually do that. So fold the load of PACAKBASE into the macro, making it simpler for callers to use correctly. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nick Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13	powerpc/64: Correct comment on LOAD_HANDLER()	Michael Ellerman
	The comment for LOAD_HANDLER() was wrong. The part about kdump has not been true since 1f6a93e4c35e ("powerpc: Make it possible to move the interrupt handlers away from the kernel"). Describe how it currently works, and combine the two separate comments into one. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nick Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-30	Merge branch 'next' of ↵	Michael Ellerman
	git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include more 8xx optimizations, device tree updates, and MVME7100 support."
2016-07-17	powerpc/irq: Add support for HV virtualization interrupts	Benjamin Herrenschmidt
	This will be delivering external interrupts from the XIVE to the Hypervisor. We treat it as a normal external interrupt for the lazy irq disable code (so it will be replayed as a 0x500) and route it to do_IRQ. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-09	powerpc32: provide VIRT_CPU_ACCOUNTING	Christophe Leroy
	This patch provides VIRT_CPU_ACCOUTING to PPC32 architecture. PPC32 doesn't have the PACA structure, so we use the task_info structure to store the accounting data. In order to reuse on PPC32 the PPC64 functions, all u64 data has been replaced by 'unsigned long' so that it is u32 on PPC32 and u64 on PPC64 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2015-12-17	powerpc/kernel: Combine vec/loc for STD_EXCEPTION_PSERIES	Michael Ellerman
	The STD_EXCEPTION_PSERIES macro takes both a vector number, and a location (memory address). However both are always identical, so combine them to save repeating ourselves. This does mean an exception handler must always exist at the location in memory that matches its vector number. But that's OK because this is the "STD" macro (standard), which does exactly that. We have other macros for the other cases, eg. STD_EXCEPTION_PSERIES_OOL (out of line). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17	powerpc/kernel: Drop HMT_MEDIUM_PPR_DISCARD	Michael Ellerman
	HMT_MEDIUM_PPR_DISCARD is a macro which is present at the start of most of our first level exception handlers. It conditionally executes a HMT_MEDIUM instruction, which sets the processor priority to medium. On on modern systems, ie. Power7 and later, it is nop'ed out at boot. All it does is make the exception vectors more cramped, and consume 4 bytes of icache. On old systems it has the effect of boosting the processor priority at the start of exception processing. If we were previously in the idle loop for example, we may be at low or very low priority. This is desirable as we want to process the exception as fast as possible. However looking closely at the generated code, we see that in all cases we execute another HMT_MEDIUM just four instructions later. With code patching applied, the final code on an old (Power6) system will look like, eg: c000000000000300 <data_access_pSeries>: c000000000000300: 7c 42 13 78 mr r2,r2 <- c000000000000304: 7d b2 43 a6 mtsprg 2,r13 c000000000000308: 7d b1 42 a6 mfsprg r13,1 c00000000000030c: f9 2d 00 80 std r9,128(r13) c000000000000310: 60 00 00 00 nop c000000000000314: 7c 42 13 78 mr r2,r2 <- So I suggest that the added code complexity of HMT_MEDIUM_PPR_DISCARD is not justified by the benefit of boosting the processor priority for the duration of four instructions, and therefore we drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01	powerpc/64: Include KVM guest test in all interrupt vectors	Paul Mackerras
	Currently, if HV KVM is configured but PR KVM isn't, we don't include a test to see whether we were interrupted in KVM guest context for the set of interrupts which get delivered directly to the guest by hardware if they occur in the guest. This includes things like program interrupts. However, the recent bug where userspace could set the MSR for a VCPU to have an illegal value in the TS field, and thus cause a TM Bad Thing type of program interrupt on the hrfid that enters the guest, showed that we can never be completely sure that these interrupts can never occur in the guest entry/exit code. If one of these interrupts does happen and we have HV KVM configured but not PR KVM, then we end up trying to run the handler in the host with the MMU set to the guest MMU context, which generally ends badly. Thus, for robustness it is better to have the test in every interrupt vector, so that if some way is found to trigger some interrupt in the guest entry/exit path, we can handle it without immediately crashing the host. This means that the distinction between KVMTEST and KVMTEST_PR goes away. Thus we delete KVMTEST_PR and associated macros and use KVMTEST everywhere that we previously used either KVMTEST_PR or KVMTEST. It also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR, so we deleted SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-08-05	powerpc/book3s: Add basic infrastructure to handle HMI in Linux.	Mahesh Salgaonkar
	Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements basic infrastructure to handle HMI in Linux host. The design is to invoke opal handle hmi in real mode for recovery and set irq_pending when we hit HMI. During check_irq_replay pull opal hmi event and print hmi info on console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28	powerpc: Remove misleading DISABLE_INTS	Michael Ellerman
	DISABLE_INTS has a long and storied history, but for some time now it has not actually disabled interrupts. For the open-coded exception handlers, just stop using it, instead call RECONCILE_IRQ_STATE directly. This has the benefit of removing a level of indirection, and making it clear that r10 & r11 are used at that point. For the addition case we still need a macro, so rename it to clarify what it actually does. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28	powerpc: Document register clobbering in EXCEPTION_COMMON()	Michael Ellerman
	Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-23	powerpc: No need to use dot symbols when branching to a function	Anton Blanchard
	binutils is smart enough to know that a branch to a function descriptor is actually a branch to the functions text address. Alan tells me that binutils has been doing this for 9 years. Signed-off-by: Anton Blanchard <anton@samba.org>
2014-03-24	powerpc/book3s: Fix CFAR clobbering issue in machine check handler.	Mahesh Salgaonkar
	While checking powersaving mode in machine check handler at 0x200, we clobber CFAR register. Fix it by saving and restoring it during beq/bgt. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30	Merge branch 'merge' into next	Benjamin Herrenschmidt
	Merge a pile of fixes that went into the "merge" branch (3.13-rc's) such as Anton Little Endian fixes. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30	powerpc: Fix bad stack check in exception entry	Michael Neuling
	In EXCEPTION_PROLOG_COMMON() we check to see if the stack pointer (r1) is valid when coming from the kernel. If it's not valid, we die but with a nice oops message. Currently we allocate a stack frame (subtract INT_FRAME_SIZE) before we check to see if the stack pointer is negative. Unfortunately, this won't detect a bad stack where r1 is less than INT_FRAME_SIZE. This patch fixes the check to compare the modified r1 with -INT_FRAME_SIZE. With this, bad kernel stack pointers (including NULL pointers) are correctly detected again. Kudos to Paulus for finding this. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05	powerpc/book3s: Split the common exception prolog logic into two section.	Mahesh Salgaonkar
	This patch splits the common exception prolog logic into three parts to facilitate reuse of existing code in the next patch. This patch also re-arranges few instructions in such a way that the second part now deals with saving register values from paca save area to stack frame, and the third part deals with saving current register values to stack frame. The second and third part will be reused in the machine check exception routine in the subsequent patch. Please note that this patch does not introduce or change existing code logic. Instead it is just a code movement and instruction re-ordering. Patch Acked-by Paul. But made some minor modification (explained above) to address Paul's comment in the later patch(3). Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-17	kvm: powerpc: book3s: Cleanup interrupt handling code	Aneesh Kumar K.V
	With this patch if HV is included, interrupts come in to the HV version of the kvmppc_interrupt code, which then jumps to the PR handler, renamed to kvmppc_interrupt_pr, if the guest is a PR guest. This helps in enabling both HV and PR, which we do in later patch Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17	kvm: powerpc: book3s: pr: Rename KVM_BOOK3S_PR to KVM_BOOK3S_PR_POSSIBLE	Aneesh Kumar K.V
	With later patches supporting PR kvm as a kernel module, the changes that has to be built into the main kernel binary to enable PR KVM module is now selected via KVM_BOOK3S_PR_POSSIBLE Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17	KVM: PPC: Book3S HV: Add support for guest Program Priority Register	Paul Mackerras
	POWER7 and later IBM server processors have a register called the Program Priority Register (PPR), which controls the priority of each hardware CPU SMT thread, and affects how fast it runs compared to other SMT threads. This priority can be controlled by writing to the PPR or by use of a set of instructions of the form or rN,rN,rN which are otherwise no-ops but have been defined to set the priority to particular levels. This adds code to context switch the PPR when entering and exiting guests and to make the PPR value accessible through the SET/GET_ONE_REG interface. When entering the guest, we set the PPR as late as possible, because if we are setting a low thread priority it will make the code run slowly from that point on. Similarly, the first-level interrupt handlers save the PPR value in the PACA very early on, and set the thread priority to the medium level, so that the interrupt handling code runs at a reasonable speed. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-08-14	powerpc: Avoid link stack corruption for MMU on exceptions	Michael Neuling
	When we have MMU on exceptions (POWER8) and a relocatable kernel, we need to branch from the initial exception vectors at 0x0 to up high where the kernel might be located. Currently we do this using the link register. Unfortunately this corrupts the link stack and instead we should use the count register. We did this for the syscall entry path in: 6a40480 powerpc: Avoid link stack corruption in MMU on syscall entry path but I stupidly forgot to do the same for other exceptions. This patch changes the initial exception vectors to use the count register instead of the link register when we need to branch up to the relocated kernel. I have a dodgy userspace test which loops calling a function that reads the PVR (mfpvr in userspace will be emulated by the kernel via the program check exception). On POWER8 and with CONFIG_RELOCATABLE=y, I get a ~10% performance improvement with my userspace test with this patch. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14	powerpc/ppc64: Rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATE	Tiejun Chen
	The SOFT_DISABLE_INTS seems an odd name for something that updates the software state to be consistent with interrupts being hard disabled, so rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATE to avoid this confusion. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01	Merge tag 'v3.10' into next	Benjamin Herrenschmidt
	Merge 3.10 in order to get some of the last minute powerpc changes, resolve conflicts and add additional fixes on top of them.
2013-07-01	powerpc: Remove KVMTEST from RELON exception handlers	Michael Ellerman
	KVMTEST is a macro which checks whether we are taking an exception from guest context, if so we branch out of line and eventually call into the KVM code to handle the switch. When running real guests on bare metal (HV KVM) the hardware ensures that we never take a relocation on exception when transitioning from guest to host. For PR KVM we disable relocation on exceptions ourself in kvmppc_core_init_vm(), as of commit a413f47 "Disable relocation on exceptions whenever PR KVM is active". So convert all the RELON macros to use NOTEST, and drop the remaining KVM_HANDLER() definitions we have for 0xe40 and 0xe80. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.9+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-15	powerpc: Fix stack overflow crash in resume_kernel when ftracing	Michael Ellerman
	It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873 "powerpc: Rework runlatch code" by myself --BenH ] CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26	powerpc: Fix "attempt to move .org backwards" error	Paul Mackerras
	Building a 64-bit powerpc kernel with PR KVM enabled currently gives this error: AS arch/powerpc/kernel/head_64.o arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards make[2]: *** [arch/powerpc/kernel/head_64.o] Error 1 This happens because the MASKABLE_EXCEPTION_PSERIES macro turns into 33 instructions, but we only have space for 32 at the decrementer interrupt vector (from 0x900 to 0x980). In the code generated by the MASKABLE_EXCEPTION_PSERIES macro, we currently have two instances of the HMT_MEDIUM macro, which has the effect of setting the SMT thread priority to medium. One is the first instruction, and is overwritten by a no-op on processors where we save the PPR (processor priority register), that is, POWER7 or later. The other is after we have saved the PPR. In order to reduce the code at 0x900 by one instruction, we omit the first HMT_MEDIUM. On processors without SMT this will have no effect since HMT_MEDIUM is a no-op there. On POWER5 and RS64 machines this will mean that the first few instructions take a little longer in the case where a decrementer interrupt occurs when the hardware thread is running at low SMT priority. On POWER6 and later machines, the hardware automatically boosts the thread priority when a decrementer interrupt is taken if the thread priority was below medium, so this change won't make any difference. The alternative would be to branch out of line after saving the CFAR. However, that would incur an extra overhead on all processors, whereas the approach adopted here only adds overhead on older threaded processors. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-15	powerpc/kvm/book3s_hv: Preserve guest CFAR register value	Paul Mackerras
	The CFAR (Come-From Address Register) is a useful debugging aid that exists on POWER7 processors. Currently HV KVM doesn't save or restore the CFAR register for guest vcpus, making the CFAR of limited use in guests. This adds the necessary code to capture the CFAR value saved in the early exception entry code (it has to be saved before any branch is executed), save it in the vcpu.arch struct, and restore it on entry to the guest. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-15	powerpc: Save CFAR before branching in interrupt entry paths	Paul Mackerras
	Some of the interrupt vectors on 64-bit POWER server processors are only 32 bytes long, which is not enough for the full first-level interrupt handler. For these we currently just have a branch to an out-of-line handler. However, this means that we corrupt the CFAR (come-from address register) on POWER7 and later processors. To fix this, we split the EXCEPTION_PROLOG_1 macro into two pieces: EXCEPTION_PROLOG_0 contains the part up to the point where the CFAR is saved in the PACA, and EXCEPTION_PROLOG_1 contains the rest. We then put EXCEPTION_PROLOG_0 in the short interrupt vectors before we branch to the out-of-line handler, which contains the rest of the first-level interrupt handler. To facilitate this, we define new _OOL (out of line) variants of STD_EXCEPTION_PSERIES, etc. In order to get EXCEPTION_PROLOG_0 to be short enough, i.e., no more than 6 instructions, it was necessary to move the stores that move the PPR and CFAR values into the PACA into __EXCEPTION_PROLOG_1 and to get rid of one of the two HMT_MEDIUM instructions. Previously there was a HMT_MEDIUM_PPR_DISCARD before the prolog, which was nop'd out on processors with the PPR (POWER7 and later), and then another HMT_MEDIUM inside the HMT_MEDIUM_PPR_SAVE macro call inside __EXCEPTION_PROLOG_1, which was nop'd out on processors without PPR. Now the HMT_MEDIUM inside EXCEPTION_PROLOG_0 is there unconditionally and the HMT_MEDIUM_PPR_DISCARD is not strictly necessary, although this leaves it in for the interrupt vectors where there is room for it. Previously we had a handler for hypervisor maintenance interrupts at 0xe50, which doesn't leave enough room for the vector for hypervisor emulation assist interrupts at 0xe40, since we need 8 instructions. The 0xe50 vector was only used on POWER6, as the HMI vector was moved to 0xe60 on POWER7. Since we don't support running in hypervisor mode on POWER6, we just remove the handler at 0xe50. This also changes denorm_exception_hv to use EXCEPTION_PROLOG_0 instead of open-coding it, and removes the HMT_MEDIUM_PPR_DISCARD from the relocation-on vectors (since any CPU that supports relocation-on interrupts also has the PPR). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10	powerpc: Implement PPR save/restore	Haren Myneni
	[PATCH 6/6] powerpc: Implement PPR save/restore When the task enters in to kernel space, the user defined priority (PPR) will be saved in to PACA at the beginning of first level exception vector and then copy from PACA to thread_info in second level vector. PPR will be restored from thread_info before exits the kernel space. P7/P8 temporarily raises the thread priority to higher level during exception until the program executes HMT_* calls. But it will not modify PPR register. So we save PPR value whenever some register is available to use and then calls HMT_MEDIUM to increase the priority. This feature supports on P7 or later processors. We save/ restore PPR for all exception vectors except system call entry. GLIBC will be saving / restore for system calls. So the default PPR value (3) will be set for the system call exit when the task returned to the user space. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10	powerpc: Macros for saving/restore PPR	Haren Myneni
	[PATCH 5/6] powerpc: Macros for saving/restore PPR Several macros are defined for saving and restore user defined PPR value. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10	powerpc: Increase exceptions arrays in paca struct to save PPR	Haren Myneni
	[PATCH 3/6] powerpc: Increase exceptions arrays in paca struct to save PPR Using paca to save user defined PPR value in the first level exception vector. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10	powerpc: Move branch instruction from ACCOUNT_CPU_USER_ENTRY to caller	Haren Myneni
	[PATCH 1/6] powerpc: Move branch instruction from ACCOUNT_CPU_USER_ENTRY to caller The first instruction in ACCOUNT_CPU_USER_ENTRY is 'beq' which checks for exceptions coming from kernel mode. PPR value will be saved immediately after ACCOUNT_CPU_USER_ENTRY and is also for user level exceptions. So moved this branch instruction in the caller code. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10	powerpc: Add book3s privileged doorbell exception vectors	Ian Munsie
	Directed Privileged Doorbell Interrupts come in at 0xa00 (or 0xc000000000004a00 if relocation on exception is enabled), so add exception vectors at these locations. If doorbell support is not compiled in we handle it as an unknown_exception. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10	powerpc: Add book3s hypervisor doorbell exception vectors	Ian Munsie
	Directed Hypervisor Doorbell Interrupts come in at 0xe80 (or 0xc000000000004e80 if relocation on exceptions is enabled), so add exception vectors at these locations. If doorbell support is not compiled in we handle it as an unknown_exception. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15	powerpc: Add relocation on exception vector handlers	Michael Neuling
	POWER8/v2.07 allows exceptions to be taken with the MMU still on. A new set of exception vectors is added at 0xc000_0000_0000_4xxx. When the HW takes us here, MSR IR/DR will be set already and we no longer need a costly RFID to turn the MMU back on again. The original 0x0 based exception vectors remain for when the HW can't leave the MMU on. Examples of this are when we can't trust the current MMU mappings, like when we are changing from guest to hypervisor (HV 0 -> 1) or when the MMU was off already. In these cases the HW will take us to the original 0x0 based exception vectors with the MMU off as before. This uses the new macros added previously too implement these new execption vectors at 0xc000_0000_0000_4xxx. We exit these exception vectors using mflr/blr (rather than mtspr SSR0/RFID), since we don't need the costly MMU switch anymore. This moves the __end_interrupts marker down past these new 0x4000 vectors since they will need to be copied down to 0x0 when the kernel is not at 0x0. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15	powerpc: Add new macros needed for relocation on exceptions	Michael Neuling
	POWER8/v2.07 allows exceptions to be taken with the MMU still on. A new set of exception vectors is added at 0xc000_0000_0000_4xxx. When the HW takes us here, MSR IR/DR will be set already and we no longer need a costly RFID to turn the MMU back on again. The original 0x0 based exception vectors remain for when the HW can't leave the MMU on. Examples of this are when we can't trust the current the MMU mappings, like when we are changing from guest to hypervisor (HV 0 -> 1) or when the MMU was off already. In these cases the HW will take us to the original 0x0 based exception vectors with the MMU off as before. The below macros are copies of the macros used at the 0x0 offset but modified to handle the MMU being on. In these macros we use the link register to jump to the secondary handlers rather than using RFID (RFID was also use to turn on the MMU). Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15	powerpc: Make load_hander handle upto 64k offset	Michael Neuling
	If we change load_hander() to use an ori instead of addi, we can load handlers upto 64k away provided we are still 64k aligned. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-11	powerpc: Use CURRENT_THREAD_INFO instead of open coded assembly	Stuart Yoder
	Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-09	powerpc/irq: Make alignment & program interrupt behave the same	Benjamin Herrenschmidt
	Alignment was the last user of the ENABLE_INTS macro, which we can now remove. All non-syscall exceptions now disable interrupts on entry, they get re-enabled conditionally from C code. Don't unconditionally re-enable in program check either, check the original context. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09	powerpc: Rework lazy-interrupt handling	Benjamin Herrenschmidt
	The current implementation of lazy interrupts handling has some issues that this tries to address. We don't do the various workarounds we need to do when re-enabling interrupts in some cases such as when returning from an interrupt and thus we may still lose or get delayed decrementer or doorbell interrupts. The current scheme also makes it much harder to handle the external "edge" interrupts provided by some BookE processors when using the EPR facility (External Proxy) and the Freescale Hypervisor. Additionally, we tend to keep interrupts hard disabled in a number of cases, such as decrementer interrupts, external interrupts, or when a masked decrementer interrupt is pending. This is sub-optimal. This is an attempt at fixing it all in one go by reworking the way we do the lazy interrupt disabling from the ground up. The base idea is to replace the "hard_enabled" field with a "irq_happened" field in which we store a bit mask of what interrupt occurred while soft-disabled. When re-enabling, either via arch_local_irq_restore() or when returning from an interrupt, we can now decide what to do by testing bits in that field. We then implement replaying of the missed interrupts either by re-using the existing exception frame (in exception exit case) or via the creation of a new one from an assembly trampoline (in the arch_local_irq_enable case). This removes the need to play with the decrementer to try to create fake interrupts, among others. In addition, this adds a few refinements: - We no longer hard disable decrementer interrupts that occur while soft-disabled. We now simply bump the decrementer back to max (on BookS) or leave it stopped (on BookE) and continue with hard interrupts enabled, which means that we'll potentially get better sample quality from performance monitor interrupts. - Timer, decrementer and doorbell interrupts now hard-enable shortly after removing the source of the interrupt, which means they no longer run entirely hard disabled. Again, this will improve perf sample quality. - On Book3E 64-bit, we now make the performance monitor interrupt act as an NMI like Book3S (the necessary C code for that to work appear to already be present in the FSL perf code, notably calling nmi_enter instead of irq_enter). (This also fixes a bug where BookE perfmon interrupts could clobber r14 ... oops) - We could make "masked" decrementer interrupts act as NMIs when doing timer-based perf sampling to improve the sample quality. Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- v2: - Add hard-enable to decrementer, timer and doorbells - Fix CR clobber in masked irq handling on BookE - Make embedded perf interrupt act as an NMI - Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want to retrigger an interrupt without preventing hard-enable v3: - Fix or vs. ori bug on Book3E - Fix enabling of interrupts for some exceptions on Book3E v4: - Fix resend of doorbells on return from interrupt on Book3E v5: - Rebased on top of my latest series, which involves some significant rework of some aspects of the patch. v6: - 32-bit compile fix - more compile fixes with various .config combos - factor out the asm code to soft-disable interrupts - remove the C wrapper around preempt_schedule_irq v7: - Fix a bug with hard irq state tracking on native power7
2012-03-09	powerpc: Replace mfmsr instructions with load from PACA kernel_msr field	Benjamin Herrenschmidt
	On 64-bit, the mfmsr instruction can be quite slow, slower than loading a field from the cache-hot PACA, which happens to already contain the value we want in most cases. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09	powerpc: Improve behaviour of irq tracing on 64-bit exception entry	Benjamin Herrenschmidt
	Some exceptions would unconditionally disable interrupts on entry, which is fine, but calling lockdep every time not only adds more overhead than strictly needed, but also means we get quite a few "redudant" disable logged, which makes it hard to spot the really bad ones. So instead, split the macro used by the exception code into a normal one and a separate one used when CONFIG_TRACE_IRQFLAGS is enabled, and make the later skip th tracing if interrupts were already disabled. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09	powerpc: Rework runlatch code	Benjamin Herrenschmidt
	This moves the inlines into system.h and changes the runlatch code to use the thread local flags (non-atomic) rather than the TIF flags (atomic) to keep track of the latch state. The code to turn it back on in an asynchronous interrupt is now simplified and partially inlined. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09	powerpc: Use the same interrupt prolog for perfmon as other interrupts	Benjamin Herrenschmidt
	The perfmon interrupt is the sole user of a special variant of the interrupt prolog which differs from the one used by external and timer interrupts in that it saves the non-volatile GPRs and doesn't turn the runlatch on. The former is unnecessary and the later is arguably incorrect, so let's clean that up by using the same prolog. While at it we rename that prolog to use the _ASYNC prefix. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09	powerpc: Remove legacy iSeries bits from assembly files	Benjamin Herrenschmidt
	This removes the various bits of assembly in the kernel entry, exception handling and SLB management code that were specific to running under the legacy iSeries hypervisor which is no longer supported. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>