ASIC hang randomly.
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. perform gpu reset
2. perform dispatch test to verify gpu reset to a good state
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
v2: drop DRM_SYNCOBJ_CREATE_TYPE_TIMELINE, fix timeout calculation,
fix some warnings
v3: add export/import and cpu signal testing cases
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Saw a couple of typos fixes in the patch DragonFlyBSD carries [1], so
I ran codespell (a spell checker for code) on the whole repo.
[1] https://github.com/DragonFlyBSD/DPorts/blob/master/graphics/libdrm/files/patch-xf86drm.c
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
add memset draw test for gfx9
Signed-off-by: Flora Cui <flora.cui@amd.com>
Tested-by: Rui Teng <rui.teng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add memset dispatch test for gfx9
v2: disable dispatch test for other ASICs
Signed-off-by: Flora Cui <flora.cui@amd.com>
Tested-by: Rui Teng <rui.teng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
deadlock test for sdma will cause gpu recoverty.
disable the test for now until GPU reset recovery could survive at least
1000 times test.
v2: add modprobe parameter
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Illegal access will cause CP hang followed by job timeout and
recovery kicking in.
Also, disable the suite for all APU ASICs until GPU
reset issues for them will be resolved and GPU reset recovery
will be enabled by default.
v2:
Add KV to deasbled APUs list and add comments regarding
necessary kernel amdgpu paramteres to run the tests.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This will prevent any more missing `#include "config.h"` bug, at the
cost of having to recompile some files that didn't need to be when
changing build options.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
if some system has no swap space and memory is less, than the test
could fail.
And bo eviction test takes much more time in some system, which effects
automation test result and efficiency.
So disable it by default now, only be used by developer manually.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
The test is as following:
1) Create context A & B
2) Send a command submission using context A which fires up a compute shader.
3) The shader wait a bit and then write a value to a memory location.
4) Send a command submission using context B which writes another value to the same memory location, but having an explicit dependency on the first command submission.
5) Wait with the CPU for both submissions to finish and inspect the written value.
Test passes if the value seen in the memory location after both submissions is from command B.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Disable the test instead of doing #ifdef 0
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
The test stalls the CP, until RCA is done the test is
disabled to not disrupt regression testing.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Switch from disabling tests during run to using the new disable
API.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Suits are diasbled based on hooks they provide (e.g incompatible
ASIC or missing blocks). Single tests are diasbled explicitly.
Suit or test can be forced to execute even if disabled by adding -f
flag after specifying suit [test] ids.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
The test will Reserve a VMID, submit a command and
unreserve the VMID.
v2:
Wrappers names were changed.
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Adding initial tests for locks detection when SW
scheduler FIFO is full.
The test works by submitting a batch of identical commands which make the CP
stall waiting for condition to become true. The condition is later satisfied
form a helper thread. Other events that happen during this time
might create deadlock situations. One such example is GPU reset
triggered by this stall when amdgpu_lockup_timeout != 0.
v2:
Increase the delay from 2 to 100 ms.
Comment out the compute test until it's working.
Typos fix.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher at amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
The device_id option [-d] was badly broken. This commit fixes
the width (was 8 is now 16 bits) as well as enables searches
without specifying a bus id. It was also comparing "dev" from
the bus field which is not the PCI device id.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Tested:
1. As root, tests passed on primary.
2. As root, tests passed on render node.
BO export/import test was skipped
3. As non-privileged user, tests failed on primary as expected.
4. As non-privileged user, tests passed on render node.
BO export/import test was skipped
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This can be used to test multiple GPUs
v2: Use PCI bus ID and optional PCI device ID to choose device
Add an option to display information of AMDGPU devices
Tested:
./amdgpu_test -p
./amdgpu_test
./amdgpu_test -b 1 #fail as expected
./amdgpu_test -b 6 #pass
./amdgpu_test -b -d 1 #fail as expected
./amdgpu_test -b -d 0 #pass
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Verify the vender ID and driver name.
Open all AMDGPU devices.
Provide an option to open render node.
Tested as root: PASS
Tested as non-privileged user:
All tests failed as expected
v2: Return value in the ene of function amdgpu_open_devices.
Check the return value of amdgpu_open_devices.
amdgpu_test is not for USB device for the time being.
Get the name of node from function drmGetDevices2.
Drop the legacy drmAvailable() from the test.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In the latest version of CUnit the fourth parameter of the CU_SuiteInfo
struct is pSetUpFunc rather than *pTests.
Seems like the CUnit ABI broke at some point, so let's the the robust
thing and use c99 designated initializers to correctly populate the
struct(s).
Cc: Leo Liu <leo.liu@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This adds some basic unit tests for the new amdgpu driver.
v2: use common util_math.h
v3: implement suggestions from Emil
replace malloc/memset with calloc
make header guards all caps
use posix_memalign rather than mem_align
replace malloc with calloc for pm4 allocations
make CU_SuiteInfo static
fix Makefile.am
fix fd numbering
use drmGetVersion/drmFreeVersion rather than open coding it
close fd, clean up CU registry on error
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>