author | Philip Yang <Philip.Yang@amd.com> | 2021-11-18 15:24:55 -0500 |
---|---|---|
committer | Alex Deucher <alexander.deucher@amd.com> | 2021-12-01 16:03:34 -0500 |
commit | 3c2d6ea27955cfac8590884d207353eece8c2cee (patch) | |
tree | 63a0a9fe1f8a640c6299a418a0a250e01046fe6b /drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | |
parent | 232d1d43b522b64266a16606e918ce92a8a0b244 (diff) |
drm/amdgpu: handle IH ring1 overflow
IH ring1 is used to process GPU retry faults. Overflow is enabled on it
to drain retry faults, because we want to receive other interrupts while
handling a retry fault to recover a range. No overflow flag is set when
wptr passes rptr, so use the timestamps of rptr and wptr to handle
overflow and drain retry faults.
If a fault's timestamp goes backward, the fault is filtered and should
not be processed. Draining faults is finished once processed_timestamp
is equal to or larger than the checkpoint timestamp.
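
The two timestamp checks described above can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: the helper names `ts_after`, `fault_is_stale` and `drain_finished` are hypothetical, and the 48-bit timestamp width is an assumption that the commit message does not state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: timestamps are 48-bit counters that can wrap around. */
#define TS_BITS 48

/* Hypothetical helper: true if t2 is later than t1, tolerating wrap-around. */
static bool ts_after(uint64_t t1, uint64_t t2)
{
	/*
	 * Shift both values to the top of a 64-bit word, subtract as
	 * unsigned, then read the sign of the difference.
	 */
	uint64_t a = t1 << (64 - TS_BITS);
	uint64_t b = t2 << (64 - TS_BITS);

	return (int64_t)(b - a) > 0;
}

/* A fault is stale if the ring has already processed a later timestamp. */
static bool fault_is_stale(uint64_t fault_ts, uint64_t processed_ts)
{
	return ts_after(fault_ts, processed_ts);
}

/* Draining is finished once processed_ts has reached the checkpoint. */
static bool drain_finished(uint64_t processed_ts, uint64_t checkpoint_ts)
{
	return !ts_after(processed_ts, checkpoint_ts);
}
```

Comparing the sign of a shifted difference, rather than using `<` directly, keeps the ordering correct when the counter wraps between the two samples.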
Add a decode_iv_ts callback to amdgpu_ih_functions so that different
chips can get the timestamp from an IV entry with a different IV size
and timestamp offset. amdgpu_ih_decode_iv_ts_helper is used for vega10,
vega20 and navi10.
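
As a rough sketch of what a per-chip timestamp decoder might look like, the snippet below reads a timestamp out of a ring entry through a function pointer. Everything here is illustrative: `struct ih_ring_sketch`, `decode_iv_ts_fn`, `decode_ts_at` and `decode_ts_32byte` are hypothetical names, and the layout (32-byte entries, a 48-bit timestamp in dwords 1 and 2) is an assumption rather than something this patch states.

```c
#include <stdint.h>

/* Hypothetical, simplified ring descriptor; not the driver's structure. */
struct ih_ring_sketch {
	const uint32_t *ring;	/* ring buffer viewed as 32-bit words */
	uint32_t ptr_mask;	/* ring size in bytes minus one (power of two) */
};

/* Per-chip hook: decode the timestamp of the entry at byte offset rptr. */
typedef uint64_t (*decode_iv_ts_fn)(const struct ih_ring_sketch *ih,
				    uint32_t rptr);

/* Read a 48-bit timestamp that starts ts_dw dwords into the entry. */
static uint64_t decode_ts_at(const struct ih_ring_sketch *ih, uint32_t rptr,
			     uint32_t ts_dw)
{
	uint32_t mask_dw = ih->ptr_mask >> 2;
	uint32_t idx = (rptr & ih->ptr_mask) >> 2;
	uint32_t lo = ih->ring[(idx + ts_dw) & mask_dw];
	uint32_t hi = ih->ring[(idx + ts_dw + 1) & mask_dw];

	return (uint64_t)lo | ((uint64_t)(hi & 0xffff) << 32);
}

/* Assumed layout: timestamp stored in dwords 1 and 2 of each entry. */
static uint64_t decode_ts_32byte(const struct ih_ring_sketch *ih,
				 uint32_t rptr)
{
	return decode_ts_at(ih, rptr, 1);
}
```

A chip with a different entry size or timestamp position would supply its own small wrapper around `decode_ts_at`, which is the kind of variation a per-chip callback is meant to absorb.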
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 8 |
1 files changed, 7 insertions, 1 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 08478fce00f2..2430d6223c2d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -350,6 +350,7 @@ static inline uint64_t amdgpu_gmc_fault_key(uint64_t addr, uint16_t pasid)
  * amdgpu_gmc_filter_faults - filter VM faults
  *
  * @adev: amdgpu device structure
+ * @ih: interrupt ring that the fault received from
  * @addr: address of the VM fault
  * @pasid: PASID of the process causing the fault
  * @timestamp: timestamp of the fault
@@ -358,7 +359,8 @@ static inline uint64_t amdgpu_gmc_fault_key(uint64_t addr, uint16_t pasid)
  * True if the fault was filtered and should not be processed further.
  * False if the fault is a new one and needs to be handled.
  */
-bool amdgpu_gmc_filter_faults(struct amdgpu_device *adev, uint64_t addr,
+bool amdgpu_gmc_filter_faults(struct amdgpu_device *adev,
+			      struct amdgpu_ih_ring *ih, uint64_t addr,
 			      uint16_t pasid, uint64_t timestamp)
 {
 	struct amdgpu_gmc *gmc = &adev->gmc;
@@ -366,6 +368,10 @@ bool amdgpu_gmc_filter_faults(struct amdgpu_device *adev, uint64_t addr,
 	struct amdgpu_gmc_fault *fault;
 	uint32_t hash;
 
+	/* Stale retry fault if timestamp goes backward */
+	if (amdgpu_ih_ts_after(timestamp, ih->processed_timestamp))
+		return true;
+
 	/* If we don't have space left in the ring buffer return immediately */
 	stamp = max(timestamp, AMDGPU_GMC_FAULT_TIMEOUT + 1) -
 		AMDGPU_GMC_FAULT_TIMEOUT;