Very similar to Evergreen, but slightly different rules for tile / slice
alignment. Fortunately, these map quite naturally onto the previous fixes for
linear aligned layout on SI.
2D tiling still needs more work here and possibly in the kernel.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
To avoid kernel rejecting cs if we return different global name
for same bo keep track of global name and always return the same.
Seems to fix issue with suspend/resume failing and repeatly printing
following message :
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
There might still be way for a rogue program to trigger this issue.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
This allows texturing with depth-stencil buffers directly without the copy
to CB. The separate miptree description for stencil is added, because
the stencil mipmap offsets are not really depth offsets/4 (at least
for the texture units).
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Another corner case that isn't well-explained yet.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
If we don't need stencil, don't allocate it.
If we need only stencil (like PIPE_FORMAT_S8_UINT), don't allocate depth.
v2: actually do it correctly
Reviewed-by: Christian König <christian.koenig@amd.com>
On r6xx or evergreen z/stencil surface don't support linear or
linear aligned surface, force 1D tiled mode for those.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Certain cards report the the wrong bank setup which causes
surface init to fail in the ddx and leads to no accel.
If we hit an invalid tiling parameter, just set a default
value and disable 2D tiling.
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=43448
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is the same list of PCI ids added by Alex Deucher in xf86-video-ati commit
aacbd629b02cbee3f9e6a0ee452b4e3f21376bd3.
This is needed since the addition of the surface allocator helper in
commit c51f7f0e46 ; it needs to differentiate
pre and post-R600 GPUs.
Therefore we should maintain another PCI id list.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=48138
Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only account for the write domain in that case.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=43893 .
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
The mipmap level computation was wrong, we need to know the block
width, height, depth of compressed texture to properly compute this.
Change API to provide block width, height, depth instead of nblk_x,
nblk_y, nblk_z.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
The surface allocator is able to build complete miptree when allocating
surface for r600/r700/evergreen/northern islands GPU family. It also
compute bo size and alignment for render buffer, depth buffer and
scanout buffer.
v2 fix r6xx/r7xx 2D tiling width align computation
v3 add tile split support and fix 1d texture alignment
v4 rework to more properly support compressed format, split surface pixel
size and surface element size in separate fields
v5 support texture array (still issue on r6xx)
v6 split surface value computation and mipmap tree building, rework eg
and newer computation
v7 add a check for tile split and 2d tiled
v8 initialize mode value before testing it in all case, reenable
2D macro tile mode on r6xx for cubemap and array. Fix cubemap
to force array size to the number of face.
v9 fix handling of stencil buffer on evergreen
v10 on evergreen depth buffer need to have enough room for a stencil
buffer just after depth one
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Dump command stream + associated bo into a binary file
which follow a similar design as json file. It allows
to intercept a command stream and replay it in a standalone
program (see radeondb tools).
bo->referenced_in_cs is checked if bo is already in cs. Adding and removing
reference in bo is done with atomic operations to allow parallel access to a
bo from multiple contexts.
cs->id generation code quarentees there is not duplicated ids which limits
number of cs->ids to 32. If there is more cs objects rest will get id 0.
V2:
- Fix configure to check for atomics operations if libdrm_radeon is only selected.
- Make atomic operations private to libdrm.
This optimization decreases cs_write_reloc share of torcs profiling from 4.3%
to 2.6%.
Tested-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
If there is section size mismatch reusing the section object
makes section start fail.
Reseting the object before doing error checking prevents the
possible flood of errors.