path: root/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
Age | Commit message | Author
2021-12-13 | drm/msm/a6xx: Skip crashdumper state if GPU needs_hw_init | Rob Clark
I am seeing some crash logs which imply that we are trying to use crashdumper hw to read back GPU state when the GPU isn't initialized. This doesn't go well (for example, the GPU could be in 32b address mode and ignoring the upper bits of the buffer that it is trying to dump state to). I'm not *quite* sure how we get into this state in the first place, but let's not make a bad situation worse by triggering iova fault crashes.
While we're at it, also add the information about whether the GPU is initialized to the devcore dump to make this easier to see in the logs (which makes the WARN_ON() redundant and even harmful because it fills up the small bit of dmesg we get with the crash report).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211209193118.1163248-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
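The guard described in this commit can be sketched in plain C. This is only an illustration under stated assumptions: `struct gpu`, `struct gpu_state`, `capture_state()` and `show_state()` are hypothetical stand-ins, not the driver's real types or functions, and the output format of the "initialized" line is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver structures. */
struct gpu {
    bool needs_hw_init;       /* true while the GPU has not been initialized */
};

struct gpu_state {
    bool gpu_initialized;     /* recorded so the dump shows it */
    bool crashdumper_used;    /* whether the hardware dumper path was taken */
};

/* Record the init status and only take the crashdumper path when it is safe. */
static void capture_state(const struct gpu *gpu, struct gpu_state *state)
{
    state->gpu_initialized = !gpu->needs_hw_init;
    state->crashdumper_used = false;

    if (state->gpu_initialized) {
        /* A real driver would program the crashdumper hardware here. */
        state->crashdumper_used = true;
    }
}

/* Emit the init status into the dump text instead of warning in dmesg. */
static int show_state(const struct gpu_state *state, char *buf, size_t len)
{
    return snprintf(buf, len, "gpu-initialized: %d\n", state->gpu_initialized);
}
```

With `needs_hw_init` set, `capture_state()` leaves `crashdumper_used` false and the dump text still records why.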
2021-11-29 | drm/msm/gpu: Snapshot GMU debug buffer | Rob Clark
It appears to be a GMU fw build option whether it does anything with debug and log buffers, but if they are all zeros it won't add anything to the devcore size.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-10-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-29 | drm/msm/gpu: Also snapshot GMU HFI buffer | Rob Clark
This also includes a history of start index of the last 8 messages on each queue, since parsing backwards to decode recently sent HFI messages is hard(ish).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-9-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
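A per-queue history like the one described here could be kept in a small ring buffer. The sketch below is an assumption about the general technique, not the driver's code: `struct queue_history`, `history_record()` and `history_snapshot()` are hypothetical names, and only the depth of 8 comes from the commit text.

```c
#include <stddef.h>
#include <stdint.h>

#define HFI_HISTORY_SZ 8   /* depth taken from the commit text */

/* Hypothetical per-queue record of where recent messages started. */
struct queue_history {
    uint32_t start[HFI_HISTORY_SZ]; /* start index of each recent message */
    uint32_t pos;                   /* next slot to overwrite */
    uint32_t count;                 /* valid entries, at most HFI_HISTORY_SZ */
};

/* Remember the start index of a message that was just queued. */
static void history_record(struct queue_history *h, uint32_t start_index)
{
    h->start[h->pos] = start_index;
    h->pos = (h->pos + 1) % HFI_HISTORY_SZ;
    if (h->count < HFI_HISTORY_SZ)
        h->count++;
}

/* Copy the history out oldest-first; returns the number of entries copied. */
static size_t history_snapshot(const struct queue_history *h,
                               uint32_t out[HFI_HISTORY_SZ])
{
    uint32_t first = (h->pos + HFI_HISTORY_SZ - h->count) % HFI_HISTORY_SZ;
    size_t i;

    for (i = 0; i < h->count; i++)
        out[i] = h->start[(first + i) % HFI_HISTORY_SZ];
    return h->count;
}
```

A decoder can then walk the snapshot forwards from known message boundaries instead of parsing the queue backwards.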
2021-11-29 | drm/msm/gpu: Make a6xx_get_gmu_log() more generic | Rob Clark
Turn it into a thing we can use to snapshot other GMU buffers.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-8-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-28 | drm/msm/a6xx: Capture gmu log in devcoredump | Akhil P Oommen
Capture gmu log in coredump to enhance debugging.
Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-21 | drm/msm/a6xx: Allocate enough space for GMU registers | Douglas Anderson
In commit 142639a52a01 ("drm/msm/a6xx: fix crashstate capture for A650") we changed a6xx_get_gmu_registers() to read 3 sets of registers. Unfortunately, we didn't change the memory allocation for the array. That leads to a KASAN warning (this was on the chromeos-5.4 kernel, which has the problematic commit backported to it):
  BUG: KASAN: slab-out-of-bounds in _a6xx_get_gmu_registers+0x144/0x430
  Write of size 8 at addr ffffff80c89432b0 by task A618-worker/209
  CPU: 5 PID: 209 Comm: A618-worker Tainted: G W 5.4.156-lockdep #22
  Hardware name: Google Lazor Limozeen without Touchscreen (rev5 - rev8) (DT)
  Call trace:
   dump_backtrace+0x0/0x248
   show_stack+0x20/0x2c
   dump_stack+0x128/0x1ec
   print_address_description+0x88/0x4a0
   __kasan_report+0xfc/0x120
   kasan_report+0x10/0x18
   __asan_report_store8_noabort+0x1c/0x24
   _a6xx_get_gmu_registers+0x144/0x430
   a6xx_gpu_state_get+0x330/0x25d4
   msm_gpu_crashstate_capture+0xa0/0x84c
   recover_worker+0x328/0x838
   kthread_worker_fn+0x32c/0x574
   kthread+0x2dc/0x39c
   ret_from_fork+0x10/0x18
  Allocated by task 209:
   __kasan_kmalloc+0xfc/0x1c4
   kasan_kmalloc+0xc/0x14
   kmem_cache_alloc_trace+0x1f0/0x2a0
   a6xx_gpu_state_get+0x164/0x25d4
   msm_gpu_crashstate_capture+0xa0/0x84c
   recover_worker+0x328/0x838
   kthread_worker_fn+0x32c/0x574
   kthread+0x2dc/0x39c
   ret_from_fork+0x10/0x18
Fixes: 142639a52a01 ("drm/msm/a6xx: fix crashstate capture for A650")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20211103153049.1.Idfa574ccb529d17b69db3a1852e49b580132035c@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
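The bug class here is an element count that lives in two places: the allocation and the code that fills the array. One way to avoid it, sketched below in userspace C, is to derive both from the same table. This is not the actual fix; `struct reg_block`, `struct reg_obj`, `gmu_blocks` and `alloc_reg_objs()` are hypothetical names, and the table contents are made up.

```c
#include <stddef.h>
#include <stdlib.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical description of one set of registers to dump. */
struct reg_block {
    const char *name;
    size_t count;
};

/* Hypothetical captured object, one per register set. */
struct reg_obj {
    const struct reg_block *handle;
    unsigned int *data;
};

/* Three sets, as in the commit text; names and sizes are invented. */
static const struct reg_block gmu_blocks[] = {
    { "gmu-set-0", 4 },
    { "gmu-set-1", 2 },
    { "gmu-set-2", 3 },
};

/* Allocate exactly one object per table entry and report the count. */
static struct reg_obj *alloc_reg_objs(size_t *nr)
{
    struct reg_obj *objs = calloc(ARRAY_SIZE(gmu_blocks), sizeof(*objs));
    size_t i;

    if (!objs)
        return NULL;

    for (i = 0; i < ARRAY_SIZE(gmu_blocks); i++)
        objs[i].handle = &gmu_blocks[i];

    *nr = ARRAY_SIZE(gmu_blocks);
    return objs;
}
```

Adding a fourth register set then only means adding a table entry; the allocation and the loop bound follow automatically.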
2021-10-15 | drm/msm/a6xx: correct cx_debugbus_read arguments | Dmitry Baryshkov
First argument of cx_debugbus_read() should be 'void __iomem *' rather than 'void * __iomem' to make sparse happy.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20211002183118.748841-1-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-07-27 | drm/msm: drop drm_gem_object_put_locked() | Rob Clark
No idea why we were still using this. It certainly hasn't been needed for some time. So drop the pointless twin codepaths.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-23 | drm/msm: devcoredump iommu fault support | Rob Clark
Wire up support to stall the SMMU on iova fault, and collect a devcoredump snapshot for easier debugging of faults.
Currently this is a6xx-only, but mostly only because so far it is the only one using adreno-smmu-priv.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210610214431.539029-6-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-23 | drm/msm: replace MSM_BO_UNCACHED with MSM_BO_WC for internal objects | Jonathan Marek
msm_gem_get_vaddr() currently always maps as writecombine, so use the right flag instead of relying on broken behavior (things don't actually work if they are mapped as uncached).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210423190833.25319-3-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-11-29 | drm/msm/adreno/a6xx_gpu_state: Make some local functions static | Lee Jones
Fixes the following W=1 kernel build warning(s):
  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:83:7: warning: no previous prototype for ‘state_kcalloc’ [-Wmissing-prototypes]
  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:95:7: warning: no previous prototype for ‘state_kmemdup’ [-Wmissing-prototypes]
  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:947:6: warning: no previous prototype for ‘a6xx_gpu_state_destroy’ [-Wmissing-prototypes]
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: linux-arm-msm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-09-12 | drm/msm/a6xx: fix a potential overflow issue | Zhenzhong Duan
The code allocates an array of a6xx_gpu_state_obj structures rather than an array of pointers to them, so the element size must be the size of the structure. This patch fixes it.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
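The difference between sizing by the pointer and sizing by the pointed-to structure can be shown with a short userspace sketch. The names `struct state_obj` and `alloc_state_objs()` are hypothetical and the struct layout is assumed for illustration; this is not the patched driver code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical captured-state object; the layout is assumed. */
struct state_obj {
    const void *handle;
    unsigned int *data;
};

/* Allocate n zeroed objects, sizing each element by the struct. */
static struct state_obj *alloc_state_objs(size_t n)
{
    struct state_obj *objs;

    /*
     * sizeof(*objs) is the size of one structure. Writing sizeof(objs)
     * would give the size of a pointer, which is smaller here, so the
     * array would be under-allocated and later writes would overflow it.
     */
    objs = calloc(n, sizeof(*objs));
    return objs;
}
```

Using `sizeof(*objs)` also keeps the allocation correct if the variable's type changes later.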
2020-08-22 | drm/msm/a6xx: add module param to enable debugbus snapshot | Rob Clark
For production devices, the debugbus sections will typically be fused off and empty in the gpu device coredump. But since this may contain data like cache contents, don't capture it by default.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-07-31 | drm/msm/a6xx: fix crashstate capture for A650 | Jonathan Marek
A650 has a separate RSCC region, so dump RSCC registers separately, reading them from the RSCC base. Without this change a GPU hang will cause a system reset if CONFIG_DEV_COREDUMP is enabled.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-01-02 | drm: msm: a6xx: Dump GBIF registers, debugbus in gpu state | Sharat Masetty
Add the relevant GBIF registers and the debug bus to the a6xx gpu state. This comes in pretty handy when debugging GPU bus related issues.
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-06 | drm: msm: a6xx: fix debug bus register configuration | Sharat Masetty
Fix the cx debugbus related register configuration, to collect accurate bus data during gpu snapshot. This helps produce a complete snapshot dump and also lets GPU recovery complete properly.
Fixes: 1707add81551 ("drm/msm/a6xx: Add a6xx gpu state")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/339165
2018-12-11 | drm/msm/a6xx: Add a name for the crashdumper buffer | Jordan Crouse
Add a buffer object name for the a6xx crashdumper so it can be seen with the changes introduced by 7799a98edd ("drm/msm: Add a name field for gem objects").
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-11 | drm/msm/a6xx: Use new kernel API free function for gpu state | Jordan Crouse
dadb36b7ec42 ("drm/msm: Add a common function to free kernel buffer objects") missed freeing the crashdumper state for a6xx.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-11 | drm/msm: Count how many times iova memory is pinned | Jordan Crouse
Add a reference count to track how many times a particular chunk of iova memory is pinned (mapped) in the IOMMU, and add msm_gem_unpin_iova to give up references.
It is important to note that msm_gem_unpin_iova replaces msm_gem_put_iova because the new implicit behavior is that an assigned iova in a given vma is valid for the life of the buffer, and what we are really focusing on is the use of that iova.
For now the unmappings are lazy; once the reference counts go to zero they *COULD* be unmapped dynamically but that will require an outside force such as a shrinker or mm_notifiers. For now, we're just focusing on getting the counting right and setting ourselves up to be ready for the future.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
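The pin counting with lazy unmapping described above can be sketched as follows. This is a simplified userspace model under stated assumptions, not the msm GEM code: `struct vma`, `pin_iova()`, `unpin_iova()` and `vma_can_unmap()` are hypothetical names, and locking is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mapping of a buffer into one address space. */
struct vma {
    uint64_t iova;   /* assigned once, valid for the life of the buffer */
    int inuse;       /* how many users currently have the iova pinned */
    bool mapped;     /* whether the IOMMU mapping exists */
};

/* Take a reference on the mapping, creating it on first use. */
static int pin_iova(struct vma *vma, uint64_t *iova)
{
    if (!vma->mapped)
        vma->mapped = true;   /* a real driver would map pages here */

    vma->inuse++;
    *iova = vma->iova;
    return 0;
}

/* Drop a reference. The mapping is kept: unmapping is lazy. */
static void unpin_iova(struct vma *vma)
{
    if (vma->inuse > 0)
        vma->inuse--;
}

/* An outside force (for example a shrinker) may unmap only unpinned vmas. */
static bool vma_can_unmap(const struct vma *vma)
{
    return vma->mapped && vma->inuse == 0;
}
```

The count only tracks use of the iova; the address itself stays assigned, which is why a later pin returns the same value.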
2018-12-11 | drm/msm/a6xx: Track and manage a6xx state memory | Jordan Crouse
The a6xx GPU state allocates a LOT of memory. Add a bit of infrastructure to track the memory allocations in the GPU structure and delete them when the state is destroyed much the same way that devm works with the device model as a whole. This protects against the developer accidentally forgetting to add a kfree() to an ever-growing list.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
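The devm-like tracking described here can be approximated in userspace C by prefixing every allocation with a list node owned by the state. This is a minimal sketch, assuming a singly linked list; `struct state_memobj`, `struct gpu_state`, `state_calloc()` and `state_destroy()` are hypothetical names that only loosely mirror the helpers mentioned elsewhere in this log.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every tracked allocation. */
struct state_memobj {
    struct state_memobj *next;
    max_align_t data[];          /* payload, aligned for any object type */
};

/* Hypothetical state object that owns all of its allocations. */
struct gpu_state {
    struct state_memobj *objs;
};

/* Allocate nr * size zeroed bytes and remember them in the state. */
static void *state_calloc(struct gpu_state *state, size_t nr, size_t size)
{
    struct state_memobj *obj;

    /* Reject requests whose total size would overflow size_t. */
    if (size != 0 && nr > (SIZE_MAX - sizeof(*obj)) / size)
        return NULL;

    obj = calloc(1, sizeof(*obj) + nr * size);
    if (!obj)
        return NULL;

    obj->next = state->objs;
    state->objs = obj;
    return obj->data;
}

/* Free everything that was allocated through state_calloc(). */
static void state_destroy(struct gpu_state *state)
{
    while (state->objs) {
        struct state_memobj *obj = state->objs;

        state->objs = obj->next;
        free(obj);
    }
}
```

Callers never free individual buffers; a single `state_destroy()` releases them all, so a newly added capture step cannot leak by forgetting its own cleanup.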
2018-12-11 | drm/msm/a6xx: Add a6xx gpu state | Jordan Crouse
Add support for gathering and dumping the a6xx GPU state including registers, GMU registers, indexed registers, shader blocks, context clusters and debugbus.
v2: Fix bugs discovered by Sharat Masetty
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>