Add assertions to drm_intel_gem_bo_reference,
drm_intel_gem_bo_reference_locked and drm_intel_gem_bo_unreference_locked
that the object has not been freed (refcount > 0). Mistakes in refcounting
lead to attempts to insert a bo into a free list more than once which causes
application failure as empty free lists are dereferenced as buffer objects.
Signed-off-by: Keith Packard <keithp@keithp.com>
This patch speeds up drmModeGetConnector by pre-allocating mode &
property info space before calling into the kernel. In many cases this
pre-allocation will be sufficient to hold the returned values (it's easy
enough to tweak if the common case becomes larger), which means we don't
have to make the second call, which saves a lot of time.
Acked-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
libdrm has some support for GTT mapping already, but there are bugs
with it (no surprise since it hasn't been used much).
In fixing 20803, I found that sharing bo_gem->virtual was a bad idea,
since a previously mapped object might not end up getting GTT mapped,
leading to corruption. So this patch splits the fields according to
use, taking care to unmap both at free time (but preserving the map
caching).
There's still a risk we might run out of mappings (there's a sysctl
tunable for max number of mappings per process, defaulted to 64k or so
it looks like) but at least GTT maps will work with these changes (and
some others for fixing PAT breakage in the kernel).
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
drm_handle_t is defined to be a u32 on linux and a u64 on everything
else. This addresses an issue on FreeBSD amd64 where the map offsets
may be greater than 32bits. When the handle is cast to 32bit, mmap
cannot match the requested map and causes X to crash.
This should be a NOOP on linux since drm_handle_t is always 32bit.
Signed-off-by: Robert Noland <rnoland@2hip.net>
This helps avoid the n^2 performance cost of counting tree size when we
get a lot of relocations into our batch buffer. rgb10text on keithp's laptop
went from 136k glyphs/sec to 234k glyphs/sec.
This avoids using the oldest BO in the BO cache and waiting for it to be
idle before we turn around and render to it with the GPU. Thanks to
Chris Wilson for pointing out how silly we were being.
nouveau_notifier.c had two places where void* was used in arithmetic,
fixed by using char*.
nouveau_dma_wait(), nouveau_notifier_wait_status() and
nouveau_resource_alloc() had signed/unsigned comparison warnings, fixed
by changing the function parameter into an unsigned type.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
This patch tries to use the available fence count to figure out whether a
given batch can succeed or not (just like the aperture check).
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
It is impossible to replace the original semantics of this call purely
in userland, since the fb_id would change.
after discussion with Dr_Jakob
Signed-Off-By: Owain Ainsworth <oga@openbsd.org>
Acked-By: Jakob Bornecrantz <jakob@vmware.com>
Michel caught a case where we might overwrite a success or other return
value with EBUSY, so check the return value before checking for the
timeout condition.
In some cases, vblank interrupts may be disabled or otherwise broken.
The kernel has a 3s timeout builtin to handle these cases, but the X
server's SIGALM for cursor handling may interrupt vblank wait ioctls,
causing libdrm to restart the ioctl, making the kernel's timeout
useless.
This change tracks time across ioctl restarts and returns EBUSY to the
caller if the expected vblank sequence doesn't occur within 1s of the
first call.
Fixes fdo bz #18041, which is caused by a drmWaitVBlank hanging due to
the corresponding pipe getting disabled (thus preventing further events
from coming in).
Remember tiling mode values provided by appplications, and
record tiling mode when creating a buffer from another application. This
eliminates any need to ask the kernel for tiling values and also makes
reused buffers get the right tiling.
Signed-off-by: Keith Packard <keithp@keithp.com>
Applications may actually care if the mapping operation failed, so when
it happens, return an error indication. errno is probably trashed by
fprintf though.
Signed-off-by: Keith Packard <keithp@keithp.com>
The execbuffer ioctl returns ENOMEM when it fails to pin all of the buffers
in the GTT. This is usually caused by the DRM client attempting to use too
much memory in a single request. Dumping out the requested and available
memory values should help point out failures in the DRM code to catch over
commitments of this form.
Signed-off-by: Keith Packard <keithp@keithp.com>
Add mode setting files to libdrm, including xf86drmMode.* and the new
drm_mode.h header. Also add a couple of tests to sanity check the
kernel interfaces and update code to support them.
BO are referenced once by reloc to make sure that they not destroyed
before we get a chance to flush the cmd stream, so we need to unreference
them once in cs submit or cs erase if cs i never submitted so bo can
be destructed.
Relocation now consist of the following informations (in this order) :
handle buffer object handle identifier
start_offset start offset of first data of the buffer object used by the cs
end_offset end offset of last data of the buffer object used by the cs
read_domain read domain (either VRAM, or GTT as GPU is invalid for CS)
write_domain write domain (either VRAM, or GTT as GPU is invalid for CS)
flags flags used for further optimization (like discard previous
buffer content or forget buffer content after cs which can
help in avoiding moving content in or out)
I wanted to avoid doing this, as it's a bunch of churn, but there was a
conflict between the dri_ symbols in libdrm and the symbols that were in
Mesa in 7.2, which broke Mesa 7.2 AIGLX when the 2D driver had loaded new
libdrm symbols. The new naming was recommended by cworth for giving the
code a unique prefix identifying where the code lives.
Additionally, take the opportunity to fix up two API mistakes: emit_reloc's
arguments were in a nonsensical order, and set_tiling lacked the stride
argument that the kernel will want to use soon. API compatibility with
released code is maintained using #defines.
This relies on a new kernel ioctl to get the available aperture size.
In order to provide reasonable performance from dri_bufmgr_check_aperture, we
now require that once a buffer has been used as the target of a relocation,
it gets no further relocations added to it. This cuts the cost of
check_aperture from 10% to 1% in the 3D driver with no code changes, but
slightly complicates our plans for the 2D driver.
Don't count on ioctl returning -errno; use errno directly.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
We want to be able to use the bufmgr from multiple threads for GL, and thus
we need to protect the internal structures.
The pthread-stubs package is used so that programs not linked against
pthreads get weak symbols to stubs and don't eat most of the cost.
When using bufmgr_fake without DRM, the X server idles the ring whenever it
wants to wait for something to complete (brutal, but effective). In this
case, bufmgr_fake must treat the pending fence as having passed. However, it
wasn't recording the fences as it emitted them, nor cleaning buffers as they
passed.
Signed-off-by: Keith Packard <keithp@keithp.com>
We want to be able to use the bufmgr from multiple threads for GL, and thus
we need to protect the internal structures.
The pthread-stubs package is used so that programs not linked against
pthreads get weak symbols to stubs and don't eat most of the cost.
We need a way of getting at the underlying handle for use with mode
setting. We can either export it in the dri_bo object or provide a new
callback to get it.
I'd swapped the operands, so if we weren't in lockstep with the hardware we
said the sequence was always passed. Additionally, a race was available that
we might have failed at recovering from. Instead, I've replaced the logic
with new stuff that should be more robust and not rely on all the parties in
userland following the same IRQ_EMIT() == 1 protocol. Also, in a radical
departure from past efforts, include a long comment describing the failure
modes and how we're working around them.
Thanks to haihao for catching the original issue.