27 Commits (main)

Author SHA1 Message Date
Likun Gao 003eb2a554 tests/amdgpu: add sdma slow copy linear hang test
Issue slow copy linear for sdma to trigger SDMA hang test.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
2022-08-15 06:10:15 +00:00
Likun Gao 3c04686ae5 tests/amdgpu: add sdma corrupted header hang test
Issue corrupted header for sdma to trigger SDMA hang test.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
2022-08-15 06:10:15 +00:00
Flora Cui cc3c80c6ae tests/amdgpu: refactor dispatch/draw test
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
2022-08-15 06:10:15 +00:00
Flora Cui f2314a4871 tests/amdgpu: add draw test for gfx10
Signed-off-by: Flora Cui <flora.cui@amd.com>
2022-03-30 16:39:02 +08:00
Alex Deucher dce623f525 test/amdgpu: only disable deadlock tests on asics without GPU reset
Switch the logic to only disable the tests for asics which don't
have GPU reset support.  This way we don't need to update it
every time we add a new asic which does support it.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-02 16:10:41 +00:00
Feifei Xu 7b844dabf9 tests/amdgpu:retire asic_id check on unsupported cases
Retire the asic_id check for AI family.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
2021-04-12 17:46:31 +08:00
Le Ma 6e10ac07c8 tests/amdgpu: disable unsupported test cases for Arcturus
Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-15 13:45:27 -04:00
Eric Engestrom 074947ee4b meson: always define whether headers exist
Combined with -Wundef (added in 75758d2ccf & enforced in ba17673eed),
this provides absolute safety against #ifdef typos.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2020-01-23 17:00:12 +00:00
Flora Cui 31a6ec141a tests/amdgpu: add gfx ring bad slow draw test
for gfx9

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:46:53 -05:00
Flora Cui 5e1f6533a0 tests/amdgpu: add gfx ring draw hang test
for gfx9

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:46:53 -05:00
Flora Cui 71b9e68d99 tests/amdgpu: add bad slow dispatch test
add gfx/compute bad slow dispatch test for gfx9

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:46:53 -05:00
Flora Cui d72b9189c4 tests/amdgpu: add dispatch hang test
add compute/gfx dispatch hang test for gfx9

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:46:53 -05:00
Cui, Flora 8c6dbd7938 tests/amdgpu: add deadlock test for sdma
The deadlock test for sdma will cause GPU recovery.
Disable the test for now until GPU reset recovery can survive at least
1000 test runs.

v2: add modprobe parameter

Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-18 10:49:52 -05:00
Andrey Grodzovsky ba45adb2a1 amdgpu/test: Enable deadlock test for CI family (gfx7)
I retested GPU recovery with a Bonaire ASIC and it works.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
2018-12-11 15:41:06 -05:00
Andrey Grodzovsky 0be850441f amdgpu/test: Disable deadlock tests for all non gfx8/9 ASICs.
GPU reset is enabled by default only for those ASICs.
Also update the disable message and fix indentation.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-02 16:22:49 -04:00
Andrey Grodzovsky b3dec018df amdgpu/test: Add illegal register and memory access test v2
Illegal access will cause a CP hang followed by a job timeout and
recovery kicking in.
Also, disable the suite for all APU ASICs until their GPU
reset issues are resolved and GPU reset recovery
is enabled by default.

v2:
Add KV to the disabled APUs list and add comments regarding the
kernel amdgpu parameters necessary to run the tests.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-01 14:57:49 -04:00
Andrey Grodzovsky 21f1176458 amdgpu/test: Fix deadlock tests for AI and RV v2
It seems that AI and RV require an uncached memory mapping to be able
to pick up the value written to memory by the CPU after the WAIT_REG_MEM
command was already launched.
Enable the test for AI and RV.

v2:
Update commit description.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-10-10 14:38:19 -04:00
Likun Gao cc472c5bb3 amdgpu: Disable deadlock test suite for RV
disable deadlock test suite for RV

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-16 13:45:20 -05:00
Eric Engestrom 0926f0af54 meson,configure: include config.h automatically
This will prevent any more missing `#include "config.h"` bugs, at the
cost of having to recompile some files that didn't need to be when
changing build options.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-20 18:19:26 +00:00
Eric Engestrom 80f33f4529 tests/amdgpu: drop unused variables
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 15:41:52 +00:00
Michel Dänzer 8e75f5a145 amdgpu: Disable deadlock test suite by default for SI ASICs
Hangs my Cape Verde.

Acked-by: Christian König <christian.koenig@amd.com>
2018-01-26 15:25:17 +01:00
Andrey Grodzovsky 429bb5820d amdgpu: Fix segfault in deadlock test.
If amdgpu_cs_query_fence_status terminates prematurely, the BO
is sometimes unmapped before the helper thread writes a value
into it, causing a segfault.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
2018-01-26 07:45:48 -05:00
Andrey Grodzovsky 5e239f3e3d amdgpu: Update deadlock test to not assert on ECANCELED
The kernel will abort jobs for the guilty context (the one causing the GPU hang)
with -ECANCELED; don't assert if that is the case.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-01-26 07:45:34 -05:00
Michel Dänzer 6fe93b8000 amdgpu: Don't dereference device_handle after amdgpu_device_deinitialize
Fixes use after free:

==2537== Invalid read of size 4
==2537==    at 0x1162C9: suite_deadlock_tests_enable (deadlock_tests.c:101)
==2537==    by 0x10B157: amdgpu_disable_suits (amdgpu_test.c:421)
==2537==    by 0x10B157: main (amdgpu_test.c:560)
==2537==  Address 0x5e44f24 is 452 bytes inside a block of size 1,016 free'd
==2537==    at 0x4C2BE1B: free (vg_replace_malloc.c:530)
==2537==    by 0x504CD8B: amdgpu_device_reference (amdgpu_device.c:164)
==2537==    by 0x504CD8B: amdgpu_device_deinitialize (amdgpu_device.c:307)
==2537==    by 0x1162BB: suite_deadlock_tests_enable (deadlock_tests.c:97)
==2537==    by 0x10B157: amdgpu_disable_suits (amdgpu_test.c:421)
==2537==    by 0x10B157: main (amdgpu_test.c:560)
==2537==  Block was alloc'd at
==2537==    at 0x4C2CC05: calloc (vg_replace_malloc.c:711)
==2537==    by 0x504CA5E: amdgpu_device_initialize (amdgpu_device.c:212)
==2537==    by 0x116298: suite_deadlock_tests_enable (deadlock_tests.c:93)
==2537==    by 0x10B157: amdgpu_disable_suits (amdgpu_test.c:421)
==2537==    by 0x10B157: main (amdgpu_test.c:560)

Reviewed-by: Christian König <christian.koenig@amd.com>
2018-01-16 16:57:57 +01:00
Andrey Grodzovsky 18ffe485cd amdgpu: Disable deadlock test suite for Vega 10
The test stalls the CP. Until RCA is done, the test is
disabled so as not to disrupt regression testing.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
2017-11-15 23:28:45 -05:00
Andrey Grodzovsky 806d080360 amdgpu: Use new suite/test disabling functionality.
Switch from disabling tests during run to using the new disable
API.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-11-13 18:20:22 +01:00
Andrey Grodzovsky 670db97dc3 amdgpu: Add deadlock detection test suit.
Add initial tests for deadlock detection when the SW
scheduler FIFO is full.

The test works by submitting a batch of identical commands which make the CP
stall waiting for a condition to become true. The condition is later satisfied
from a helper thread. Other events that happen during this time
might create deadlock situations. One such example is a GPU reset
triggered by this stall when amdgpu_lockup_timeout != 0.

v2:
Increase the delay from 2 to 100 ms.
Comment out the compute test until it's working.
Typos fix.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-04 10:50:02 +02:00
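
The stall-and-release pattern described in the initial commit (670db97dc3) can be illustrated with a CPU-only analogy. This is a rough sketch under stated assumptions, not the code in deadlock_tests.c: no GPU or libdrm calls are involved, the polling loop merely stands in for the CP waiting on a memory word, and the names `stall_demo`, `helper_thread`, `run_stall_demo` and `sleep_ms` are hypothetical. The 100 ms delay mirrors the value mentioned in the v2 note.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the memory word that the stalled commands poll. */
struct stall_demo {
    _Atomic uint32_t word;   /* value being waited on */
    uint32_t expected;       /* value that satisfies the wait condition */
    unsigned delay_ms;       /* how long the helper waits before writing */
};

static void sleep_ms(unsigned ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (long)(ms % 1000) * 1000000L,
    };
    nanosleep(&ts, NULL);
}

/* Helper thread: satisfies the wait condition after a delay. */
static void *helper_thread(void *arg)
{
    struct stall_demo *d = arg;

    sleep_ms(d->delay_ms);
    atomic_store(&d->word, d->expected);
    return NULL;
}

/*
 * Starts the helper, then polls until the word holds the expected value
 * (a CPU analogy for the CP stalling on its wait command).
 * Returns the number of unsuccessful polls before the condition held.
 */
static unsigned long run_stall_demo(struct stall_demo *d)
{
    pthread_t thread;
    unsigned long polls = 0;
    int rc;

    rc = pthread_create(&thread, NULL, helper_thread, d);
    assert(rc == 0);

    while (atomic_load(&d->word) != d->expected) {
        polls++;
        sleep_ms(1);
    }

    rc = pthread_join(thread, NULL);
    assert(rc == 0);
    return polls;
}
```

To try it, call `run_stall_demo()` from a `main()` with a `struct stall_demo` initialized to `{ .word = 0, .expected = 0xdeadbeef, .delay_ms = 100 }` and build with `-pthread`. Joining the helper before returning keeps the shared state alive until the write has happened, which is the same ordering concern that the segfault fix in 429bb5820d addresses for the real test's BO mapping.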