This fixes VM protection faults.
I have a new piglit test which can iterate over all possible widths, heights,
and depths (including NPOT) and tests mipmapping with various texture targets.
After this is committed, I'll make a new release of libdrm and bump
the libdrm version requirement in Mesa.
Aligns the IB to 8 DWs. The aligns the IB to the
CP fetch size. r6xx also require at least 4 DW
alignment to avoid a hw bug.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RADEON_GEM_WAIT_IDLE is declared DRM_IOW but libdrm
uses it with drmCommandWriteRead instead of drmCommandWrite
which leads to the ioctl being unmatched and returning an
error on at least OpenBSD.
Signed-off-by: Mark Kettenis <kettenis@openbsd.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
v2: Only writte tile index if flags for it is set
v3: Remove useless allow2d scanout flags
v4: Split radeon_drm.h update to its own patch
v5: update against lastest next tree for radeon
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Very similar to Evergreen, but slightly different rules for tile / slice
alignment. Fortunately, these map quite naturally onto the previous fixes for
linear aligned layout on SI.
2D tiling still needs more work here and possibly in the kernel.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
To avoid kernel rejecting cs if we return different global name
for same bo keep track of global name and always return the same.
Seems to fix issue with suspend/resume failing and repeatly printing
following message :
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
There might still be way for a rogue program to trigger this issue.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
This allows texturing with depth-stencil buffers directly without the copy
to CB. The separate miptree description for stencil is added, because
the stencil mipmap offsets are not really depth offsets/4 (at least
for the texture units).
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Another corner case that isn't well-explained yet.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
If we don't need stencil, don't allocate it.
If we need only stencil (like PIPE_FORMAT_S8_UINT), don't allocate depth.
v2: actually do it correctly
Reviewed-by: Christian König <christian.koenig@amd.com>
On r6xx or evergreen z/stencil surface don't support linear or
linear aligned surface, force 1D tiled mode for those.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Certain cards report the the wrong bank setup which causes
surface init to fail in the ddx and leads to no accel.
If we hit an invalid tiling parameter, just set a default
value and disable 2D tiling.
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=43448
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is the same list of PCI ids added by Alex Deucher in xf86-video-ati commit
aacbd629b02cbee3f9e6a0ee452b4e3f21376bd3.
This is needed since the addition of the surface allocator helper in
commit c51f7f0e46 ; it needs to differentiate
pre and post-R600 GPUs.
Therefore we should maintain another PCI id list.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=48138
Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only account for the write domain in that case.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=43893 .
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
The mipmap level computation was wrong, we need to know the block
width, height, depth of compressed texture to properly compute this.
Change API to provide block width, height, depth instead of nblk_x,
nblk_y, nblk_z.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
The surface allocator is able to build complete miptree when allocating
surface for r600/r700/evergreen/northern islands GPU family. It also
compute bo size and alignment for render buffer, depth buffer and
scanout buffer.
v2 fix r6xx/r7xx 2D tiling width align computation
v3 add tile split support and fix 1d texture alignment
v4 rework to more properly support compressed format, split surface pixel
size and surface element size in separate fields
v5 support texture array (still issue on r6xx)
v6 split surface value computation and mipmap tree building, rework eg
and newer computation
v7 add a check for tile split and 2d tiled
v8 initialize mode value before testing it in all case, reenable
2D macro tile mode on r6xx for cubemap and array. Fix cubemap
to force array size to the number of face.
v9 fix handling of stencil buffer on evergreen
v10 on evergreen depth buffer need to have enough room for a stencil
buffer just after depth one
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Dump command stream + associated bo into a binary file
which follow a similar design as json file. It allows
to intercept a command stream and replay it in a standalone
program (see radeondb tools).
bo->referenced_in_cs is checked if bo is already in cs. Adding and removing
reference in bo is done with atomic operations to allow parallel access to a
bo from multiple contexts.
cs->id generation code quarentees there is not duplicated ids which limits
number of cs->ids to 32. If there is more cs objects rest will get id 0.
V2:
- Fix configure to check for atomics operations if libdrm_radeon is only selected.
- Make atomic operations private to libdrm.
This optimization decreases cs_write_reloc share of torcs profiling from 4.3%
to 2.6%.
Tested-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
If there is section size mismatch reusing the section object
makes section start fail.
Reseting the object before doing error checking prevents the
possible flood of errors.
We don't intend libdrm-radeon to become clever enough to
decode cs for all GPU we support. Better to let an external
tool do the job. This will print raw cs in an easy to parse
way.
as Michel pointed out we are exposing too much info for these object
for this to be maintainable going forward.
This patch set minimises the exposed parts of the radeon_bo and
radeon_cs objects to the piece necessary for ddx/mesa to operate
at a decent speed.
The major problem is mesa contains a legacy BO/CS managers which we still
need to expose functionality to, and we really cannot change the API
until we can drop the non-KMS codepaths.
Signed-off-by: Dave Airlie <airlied@redhat.com>