Skip to content

Commit

Permalink
drm/etnaviv: consider completed fence seqno in hang check
Browse files Browse the repository at this point in the history
Some GPU heavy test programs manage to trigger the hangcheck quite often.
If there are no other GPU users in the system and the test program
exhibits a very regular structure in the commandstreams that are being
submitted, we can end up with two distinct submits managing to trigger
the hangcheck with the FE in a very similar address range. This leads
the hangcheck to believe that the GPU is stuck, while in reality the GPU
is already busy working on a different job. To avoid those spurious
GPU resets, also remember and consider the last completed fence seqno
in the hang check.

Reported-by: Joerg Albert <joerg.albert@iav.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
  • Loading branch information
lynxeye-dev committed Dec 23, 2021
1 parent 6dfa2fa commit cdd1569
Show file tree
Hide file tree
Showing 2 changed files with 4 additions and 1 deletion.
1 change: 1 addition & 0 deletions drivers/gpu/drm/etnaviv/etnaviv_gpu.h
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,7 @@ struct etnaviv_gpu {

/* hang detection */
u32 hangcheck_dma_addr;
u32 hangcheck_fence;

void __iomem *mmio;
int irq;
Expand Down
4 changes: 3 additions & 1 deletion drivers/gpu/drm/etnaviv/etnaviv_sched.c
Original file line number Diff line number Diff line change
Expand Up @@ -107,8 +107,10 @@ static enum drm_gpu_sched_stat etnaviv_sched_timedout_job(struct drm_sched_job
*/
dma_addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS);
change = dma_addr - gpu->hangcheck_dma_addr;
if (change < 0 || change > 16) {
if (gpu->completed_fence != gpu->hangcheck_fence ||
change < 0 || change > 16) {
gpu->hangcheck_dma_addr = dma_addr;
gpu->hangcheck_fence = gpu->completed_fence;
goto out_no_timeout;
}

Expand Down

0 comments on commit cdd1569

Please sign in to comment.