summaryrefslogtreecommitdiff
path: root/arch/ia64/include/asm/mca.h
AgeCommit message (Collapse)Author
2009-12-14[IA64] Save I-resources to ia64_sal_os_stateTakao Indoh
This is a patch related to this discussion. http://www.spinics.net/lists/linux-ia64/msg07605.html When INIT is sent, ip/psr/pfs register is stored to the I-resources (iip/ipsr/ifs registers), and they are copied in the min-state save area(pmsa_{iip,ipsr,ifs}). Therefore, in creating pt_regs at ia64_mca_modify_original_stack(), cr_{iip,ipsr,ifs} should be derived from pmsa_{iip,ipsr,ifs}. But current code copies pmsa_{xip,xpsr,xfs} to cr_{iip,ipsr,ifs} when PSR.ic is 0. finish_pt_regs(struct pt_regs *regs, const pal_min_state_area_t *ms, unsigned long *nat) { (snip) if (ia64_psr(regs)->ic) { regs->cr_iip = ms->pmsa_iip; regs->cr_ipsr = ms->pmsa_ipsr; regs->cr_ifs = ms->pmsa_ifs; } else { regs->cr_iip = ms->pmsa_xip; regs->cr_ipsr = ms->pmsa_xpsr; regs->cr_ifs = ms->pmsa_xfs; } It's ok when PSR.ic is not 0. But when PSR.ic is 0, this could be a problem when we investigate kernel as the value of regs->cr_iip does not point to where INIT really interrupted. At first I tried to change finish_pt_regs() so that it uses always pmsa_{iip,ipsr,ifs} for cr_{iip,ipsr,ifs}, but Keith Owens pointed out it could cause another problem if I change it. >The only problem I can think of is an MCA/INIT >arriving while code like SAVE_MIN or SAVE_REST is executing. Back >tracing at that point using pmsa_iip is going to be a problem, you have >no idea what state the registers or stack are in. I confirmed he was right, so I decided to keep it as-is and to save pmsa_{iip,ipsr,ifs} to ia64_sal_os_state for debugging. An attached patch is just adding new members into ia64_sal_os_state to save pmsa_{iip,ipsr,ifs}. Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-09-14[IA64] kexec: Make INIT safe while transition toHidetoshi Seto
kdump/kexec kernel Summary: Asserting INIT on the beginning of kdump/kexec kernel will result in unexpected behavior because INIT handler for previous kernel is invoked on new kernel. Description: In panic situation, we can receive INIT while kernel transition, i.e. from beginning of panic to bootstrap of kdump kernel. Since we initialize registers on leave from current kernel, no longer monarch/slave handlers of current kernel in virtual mode are called safely. (In fact system goes hang as far as I confirmed) How to Reproduce: Start kdump # echo c > /proc/sysrq-trigger Then assert INIT while kdump kernel is booting, before new INIT handler for kdump kernel is registered. Expected(Desirable) result: kdump kernel boots without any problem, crashdump retrieved Actual result: INIT handler for previous kernel is invoked on kdump kernel => panic, hang etc. (unexpected) Proposed fix: We can unregister these init handlers from SAL before jumping into new kernel, however then the INIT will fallback to default behavior, result in warmboot by SAL (according to the SAL specification) and we cannot retrieve the crashdump. Therefore this patch introduces a NOP init handler and register it to SAL before leave from current kernel, to start kdump safely by preventing INITs from entering virtual mode and resulting in warmboot. On the other hand, in case of kexec that not for kdump, it also has same problem with INIT while kernel transition. This patch handles this case differently, because for kexec unregistering handlers will be preferred than registering NOP handler, since the situation "no handlers registered" is usual state for kernel's entry. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-09-14[IA64] kdump: Mask MCA/INIT on frozen cpusHidetoshi Seto
Summary: INIT asserted on kdump kernel invokes INIT handler not only on a cpu that running on the kdump kernel, but also BSP of the panicked kernel, because the (badly) frozen BSP can be thawed by INIT. Description: The kdump_cpu_freeze() is called on cpus except one that initiates panic and/or kdump, to stop/offline the cpu (on ia64, it means we pass control of cpus to SAL, or put them in spinloop). Note that CPU0(BSP) always go to spinloop, so if panic was happened on an AP, there are at least 2cpus (= the AP and BSP) which not back to SAL. On the spinning cpus, interrupts are disabled (rsm psr.i), but INIT is still interruptible because psr.mc for mask them is not set unless kdump_cpu_freeze() is not called from MCA/INIT context. Therefore, assume that a panic was happened on an AP, kdump was invoked, new INIT handlers for kdump kernel was registered and then an INIT is asserted. From the viewpoint of SAL, there are 2 online cpus, so INIT will be delivered to both of them. It likely means that not only the AP (= a cpu executing kdump) enters INIT handler which is newly registered, but also BSP (= another cpu spinning in panicked kernel) enters the same INIT handler. Of course setting of registers in BSP are still old (for panicked kernel), so what happen with running handler with wrong setting will be extremely unexpected. I believe this is not desirable behavior. How to Reproduce: Start kdump on one of APs (e.g. cpu1) # taskset 0x2 echo c > /proc/sysrq-trigger Then assert INIT after kdump kernel is booted, after new INIT handler for kdump kernel is registered. Expected results: An INIT handler is invoked only on the AP. Actual results: An INIT handler is invoked on the AP and BSP. Sample of results: I got following console log by asserting INIT after prompt "root:/>". It seems that two monarchs appeared by one INIT, and one panicked at last. And it also seems that the panicked one supposed there were 4 online cpus and no one did rendezvous: : [ 0 %]dropping to initramfs shell exiting this shell will reboot your system root:/> Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=0 ia64_init_handler: Promoting cpu 0 to monarch. Delaying for 5 seconds... All OS INIT slaves have reached rendezvous Processes interrupted by INIT - 0 (cpu 0 task 0xa000000100af0000) : <<snip>> : Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=1 Delaying for 5 seconds... mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail. OS INIT slave did not rendezvous on cpu 1 2 3 INIT swapper 0[0]: bugcheck! 0 [1] : <<snip>> : Kernel panic - not syncing: Attempted to kill the idle task! Proposed fix: To avoid this problem, this patch inserts ia64_set_psr_mc() to mask INIT on cpus going to be frozen. This masking have no effect if the kdump_cpu_freeze() is called from INIT handler when kdump_on_init == 1, because psr.mc is already turned on to 1 before entering OS_INIT. I confirmed that weird log like above are disappeared after applying this patch. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-06-17[IA64] Convert ia64 to use int-ll64.hMatthew Wilcox
It is generally agreed that it would be beneficial for u64 to be an unsigned long long on all architectures. ia64 (in common with several other 64-bit architectures) currently uses unsigned long. Migrating piecemeal is too painful; this giant patch fixes all compilation warnings and errors that come as a result of switching to use int-ll64.h. Note that userspace will still see __u64 defined as unsigned long. This is important as it affects C++ name mangling. [Updated by Tony Luck to change efi.h:efi_freemem_callback_t to use u64 for start/end rather than unsigned long] Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-08-01[IA64] Move include/asm-ia64 to arch/ia64/include/asmTony Luck
After moving the the include files there were a few clean-ups: 1) Some files used #include <asm-ia64/xyz.h>, changed to <asm/xyz.h> 2) Some comments alerted maintainers to look at various header files to make matching updates if certain code were to be changed. Updated these comments to use the new include paths. 3) Some header files mentioned their own names in initial comments. Just deleted these self references. Signed-off-by: Tony Luck <tony.luck@intel.com>