diff options
author | xinhui pan <xinhui.pan@amd.com> | 2021-09-15 09:08:28 +0800 |
---|---|---|
committer | Alex Deucher <alexander.deucher@amd.com> | 2021-09-23 15:17:29 -0400 |
commit | b2fe31cf648156331991333c1d87346321cab056 (patch) | |
tree | c0c39c35bb050b2de1bfd91ddc5a9f0cac3f7c40 /drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | |
parent | 006c26a0f1c851e0693e4bdd5657a687514d21cf (diff) |
drm/amdgpu: Put drm_dev_enter/exit outside hot codepath
We hit soft hang while doing memory pressure test on one numa system.
After a qucik look, this is because kfd invalid/valid userptr memory
frequently with process_info lock hold.
Looks like update page table mapping use too much cpu time.
perf top says below,
75.81% [kernel] [k] __srcu_read_unlock
6.19% [amdgpu] [k] amdgpu_gmc_set_pte_pde
3.56% [kernel] [k] __srcu_read_lock
2.20% [amdgpu] [k] amdgpu_vm_cpu_update
2.20% [kernel] [k] __sg_page_iter_dma_next
2.15% [drm] [k] drm_dev_enter
1.70% [drm] [k] drm_prime_sg_to_dma_addr_array
1.18% [kernel] [k] __sg_alloc_table_from_pages
1.09% [drm] [k] drm_dev_exit
So move drm_dev_enter/exit outside gmc code, instead let caller do it.
They are gart_unbind, gart_map, vm_clear_bo, vm_update_pdes and
gmc_init_pdb0. vm_bo_update_mapping already calls it.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 11 |
1 files changed, 5 insertions, 6 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 9ff600a38559..a0dec7f211f0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -153,10 +153,6 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr, { void __iomem *ptr = (void *)cpu_pt_addr; uint64_t value; - int idx; - - if (!drm_dev_enter(&adev->ddev, &idx)) - return 0; /* * The following is for PTE only. GART does not have PDEs. @@ -165,8 +161,6 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void *cpu_pt_addr, value |= flags; writeq(value, ptr + (gpu_page_idx * 8)); - drm_dev_exit(idx); - return 0; } @@ -749,6 +743,10 @@ void amdgpu_gmc_init_pdb0(struct amdgpu_device *adev) adev->gmc.xgmi.physical_node_id * adev->gmc.xgmi.node_segment_size; u64 vram_end = vram_addr + vram_size; u64 gart_ptb_gpu_pa = amdgpu_gmc_vram_pa(adev, adev->gart.bo); + int idx; + + if (!drm_dev_enter(&adev->ddev, &idx)) + return; flags |= AMDGPU_PTE_VALID | AMDGPU_PTE_READABLE; flags |= AMDGPU_PTE_WRITEABLE; @@ -770,6 +768,7 @@ void amdgpu_gmc_init_pdb0(struct amdgpu_device *adev) flags |= AMDGPU_PDE_BFS(0) | AMDGPU_PTE_SNOOPED; /* Requires gart_ptb_gpu_pa to be 4K aligned */ amdgpu_gmc_set_pte_pde(adev, adev->gmc.ptr_pdb0, i, gart_ptb_gpu_pa, flags); + drm_dev_exit(idx); } /** |