intel_bufmgr.h is installed in ${includedir} directly, and the other
headers are taken care of by libdrm.pc's Cflags.
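As a minimal sketch of what a consumer needs under this layout (the compile
line and the stdint.h include are my assumptions, not part of this change):

    /* Compile roughly as:
     *     cc -c example.c $(pkg-config --cflags libdrm)
     * libdrm.pc's Cflags covers drm.h and friends, while intel_bufmgr.h is
     * picked up from ${includedir} itself without any extra -I flag.
     */
    #include <stdint.h>
    #include <intel_bufmgr.h>

    int main(void)
    {
            return 0;
    }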
Signed-off-by: Julien Cristau <jcristau@debian.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
This is the largest untiled pitch requirement from gen2 through gen4.
This is only required for gen3 rendering to color regions with depth, but
it's rare for this to be a significant factor in memory usage -- for
example, gen4 requires 1 or 2 times the element size, or up to 64
bytes, depending on the size of the elements. This is easier than
encoding all the various little quirks for untiled pitch alignment,
since we rarely do untiled now.
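A sketch of the resulting rule, assuming a small helper in the allocation
path (the helper name is illustrative, not the actual libdrm code):

    #include <stdint.h>

    /* Round an untiled pitch up to 64 bytes, the largest requirement from
     * gen2 through gen4, instead of encoding each generation's quirk. */
    static uint32_t
    untiled_pitch_align(uint32_t pitch)
    {
            return (pitch + 63) & ~63u;
    }

    /* e.g. untiled_pitch_align(300) == 320 */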
bo->referenced_in_cs is checked to see if the bo is already in the cs. Adding and
removing a reference on a bo is done with atomic operations to allow parallel
access to a bo from multiple contexts.
The cs->id generation code guarantees that there are no duplicate ids, which limits
the number of cs->ids to 32. If there are more cs objects, the rest will get id 0.
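A rough sketch of how a per-bo bitmask plus an atomic OR can answer "is this
bo already in this cs?" without locking; the field layout and helper below
are assumptions for illustration, not the actual radeon code:

    #include <stdint.h>

    struct radeon_bo_sketch {
            /* one bit per cs id, which is where the 32-id limit comes from */
            uint32_t referenced_in_cs;
    };

    /* Atomically mark the bo as referenced by cs 'id' (1..32; 0 means the
     * cs got no slot). Returns non-zero if it was already referenced. */
    static int
    bo_mark_referenced(struct radeon_bo_sketch *bo, unsigned id)
    {
            uint32_t bit;

            if (id == 0)
                    return 0;       /* untracked cs, always (re-)add the reloc */

            bit = 1u << (id - 1);
            return !!(__sync_fetch_and_or(&bo->referenced_in_cs, bit) & bit);
    }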
V2:
- Fix configure to check for atomic operations when only libdrm_radeon is selected.
- Make atomic operations private to libdrm.
This optimization decreases cs_write_reloc's share of a torcs profile from 4.3%
to 2.6%.
Tested-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
intel_atomic.h includes very useful atomic operations for
lock-free parallel access to variables. Move these to
core libdrm so the code can be shared with radeon.
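For reference, the kind of primitives involved -- a minimal sketch on top of
GCC's __sync builtins, which is roughly what such a shared header wraps (the
names here are illustrative, not the header's actual API):

    typedef struct { int val; } atomic_sketch_t;

    static void atomic_sketch_inc(atomic_sketch_t *v)
    {
            __sync_fetch_and_add(&v->val, 1);
    }

    /* Returns non-zero when the counter reaches zero, the usual
     * "last reference dropped" test. */
    static int atomic_sketch_dec_and_test(atomic_sketch_t *v)
    {
            return __sync_sub_and_fetch(&v->val, 1) == 0;
    }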
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
Ensure that errors from the kernel are propagated back to the caller,
and not masked with return 0;
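The pattern in question, sketched with placeholder names:

    #include <errno.h>
    #include <sys/ioctl.h>

    /* Illustrative wrapper: hand the kernel's error back to the caller
     * instead of unconditionally returning 0. */
    static int
    some_ioctl_wrapper(int fd, unsigned long request, void *arg)
    {
            int ret;

            ret = ioctl(fd, request, arg);
            if (ret != 0)
                    return -errno;  /* propagate, don't mask with 0 */
            return 0;
    }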
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This allows Mesa to use drm_intel_bo_alloc_tiled() for its tiled
buffers, since it makes its decision about pitch before telling
libdrm. They happen to be the same choices for the tiled case.
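A hedged usage sketch of that entry point (the header names, dimensions and
the 0 passed for flags are assumptions on my part):

    #include <stdint.h>
    #include <intel_bufmgr.h>
    #include <i915_drm.h>           /* I915_TILING_X */

    /* Ask for a 1024x768, 32bpp X-tiled buffer; libdrm chooses the pitch
     * and may downgrade the tiling mode, reporting both back through the
     * pointer arguments. */
    static drm_intel_bo *
    alloc_example(drm_intel_bufmgr *bufmgr)
    {
            uint32_t tiling = I915_TILING_X;
            unsigned long pitch = 0;

            return drm_intel_bo_alloc_tiled(bufmgr, "example surface",
                                            1024, 768, 4,
                                            &tiling, &pitch, 0);
    }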
- Currently, reloc'ing a user bo to gart will first cause an allocation in vram,
which is then written to by the cpu, and then the bo gets moved to gart.
Acked-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Maarten Maathuis <madman2003@gmail.com>
This patch to libdrm adds support for the new execbuf2 ioctl. If
detected, it will be used instead of the old ioctl. By using the new
drm_intel_bufmgr_gem_enable_fenced_relocs(), you can indicate that any
time a fence register is actually required for a relocation target you
will call drm_intel_bo_emit_reloc_fence instead of
drm_intel_bo_emit_reloc, which will reduce fence register pressure.
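A hedged sketch of the intended usage (in practice the enable call belongs at
bufmgr setup rather than per reloc; header names and domains are assumptions):

    #include <stdint.h>
    #include <intel_bufmgr.h>
    #include <i915_drm.h>           /* I915_GEM_DOMAIN_* */

    static void
    emit_reloc_example(drm_intel_bufmgr *bufmgr,
                       drm_intel_bo *batch, drm_intel_bo *target,
                       uint32_t batch_offset, int needs_fence)
    {
            /* Opt in once, then use the _fence variant only where the
             * target really needs a fence register. */
            drm_intel_bufmgr_gem_enable_fenced_relocs(bufmgr);

            if (needs_fence)
                    drm_intel_bo_emit_reloc_fence(batch, batch_offset,
                                                  target, 0,
                                                  I915_GEM_DOMAIN_RENDER,
                                                  I915_GEM_DOMAIN_RENDER);
            else
                    drm_intel_bo_emit_reloc(batch, batch_offset,
                                            target, 0,
                                            I915_GEM_DOMAIN_RENDER,
                                            I915_GEM_DOMAIN_RENDER);
    }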
Signed-off-by: Eric Anholt <eric@anholt.net>
This commit encompasses the changes necessary to run on top of the 0.0.16
nouveau interface, additional APIs to support the new features of the
interface, as well as code from Luca Barbieri to improve the pushbuf
interface, which just happens to break nouveau's libdrm ABI so was delayed
until now.
API changes as a result of 0.0.16 DRM interface:
1. No more bo_pin()/bo_unpin(), these were only there for UMS and we no
longer support it.
2. Any random nouveau_bo can be submitted to the GPU as a push buffer.
3. Relocations can be applied on any nouveau_bo
This patch changes the pushbuffer ABI to:
1. No longer use/expose nouveau_pushbuffer. Everything is directly
in nouveau_channel. This saves the extra "pushbuf" pointer dereference.
2. Use cur/end pointers instead of tracking the remaining size.
Pushing data now only needs to alter cur and not both cur and remaining.
The goal is to make the *_RING macros faster and make the interface simpler
and cleaner in the process.
The *_RING APIs are unchanged, but those are inlined and the ABI is changed.
Also, anything accessing pushbuf->remaining instead of using AVAIL_RING
will need to be fixed.
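A rough sketch of the cur/end scheme, with illustrative names rather than
nouveau's actual fields and macros:

    #include <stdint.h>

    struct chan_sketch {
            uint32_t *cur;  /* next dword to write */
            uint32_t *end;  /* one past the last usable dword */
    };

    /* With cur/end, pushing a dword touches a single pointer... */
    #define OUT_RING_SKETCH(ch, d)   (*(ch)->cur++ = (d))

    /* ...and the remaining space is still recoverable where needed. */
    #define AVAIL_RING_SKETCH(ch)    ((uint32_t)((ch)->end - (ch)->cur))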
The SET_TILING ioctl is pernicious in that it overwrites the input arguments
following an error in order to report the current tiling state of the
buffer. This caught us by surprise, as we then fed those arguments back
into the ioctl unmodified following an EINTR, and so the kernel then
reported success for the no-op. We interpreted this success as meaning
that the tiling on the buffer had changed so updated our state and
started using the buffer incorrectly in the new tiled/untiled manner.
This led to all sorts of random corruption and GPU hangs, even though
the batch buffers would look sane (when the GPU had not wandered off
into forbidden territory).
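A sketch of the safe retry pattern, refilling the argument struct inside the
loop so an EINTR retry never submits the kernel's overwritten values (the
wrapper name and parameters are placeholders):

    #include <errno.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <i915_drm.h>

    static int
    set_tiling_safe(int fd, uint32_t handle, uint32_t tiling_mode,
                    uint32_t stride)
    {
            struct drm_i915_gem_set_tiling set_tiling;
            int ret;

            do {
                    /* re-initialise on every attempt */
                    memset(&set_tiling, 0, sizeof(set_tiling));
                    set_tiling.handle = handle;
                    set_tiling.tiling_mode = tiling_mode;
                    set_tiling.stride = stride;

                    ret = ioctl(fd, DRM_IOCTL_I915_GEM_SET_TILING,
                                &set_tiling);
            } while (ret == -1 && errno == EINTR);

            return ret == -1 ? -errno : 0;
    }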
References:
Bug 25475 - [i915] Xorg crash / Execbuf while wedged
http://bugs.freedesktop.org/show_bug.cgi?id=25475
Bug 25554 - i830_uxa_prepare_access: gtt bo map failed: Input/output error
http://bugs.freedesktop.org/show_bug.cgi?id=25554
(And probably every other weird bug in the last few months.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the kernel reports the total number of fences, we must guess how many
fences are likely to be pinned. In a typical system these will only be
used by the scanout buffers, of which there may be one per pipe, and any
number of manually pinned fenced buffers. So take a conservative guess
and reserve two fences for use by the system.
Note this reduces the number of fences to 3 for i915 and prior.
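In code terms the guess amounts to roughly this (a sketch, not the exact
bufmgr code):

    /* Reserve a conservative two fences for scanout / manually pinned
     * buffers out of the total reported by the kernel. */
    static int
    usable_fences(int total_fences)
    {
            int avail = total_fences - 2;

            return avail > 0 ? avail : 0;
    }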
Reference:
http://bugs.freedesktop.org/show_bug.cgi?id=25911
The latest intel driver 2.10.0 causes kernel oops and system hangs
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If there is a section size mismatch, reusing the section object
makes the section start fail.
Resetting the object before doing the error checking prevents a
possible flood of errors.
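Sketch of the ordering the fix describes; the struct and field names are
assumptions about the cs bookkeeping, not the actual radeon code:

    struct cs_section_sketch {
            unsigned ndw;   /* dwords declared for the section */
            unsigned cdw;   /* dwords actually written */
    };

    /* Reset the section state before reporting the mismatch, so one bad
     * section cannot make every later section start fail and flood the
     * log with errors. */
    static int
    section_begin_sketch(struct cs_section_sketch *sec, unsigned ndw)
    {
            unsigned prev_ndw = sec->ndw;
            unsigned prev_cdw = sec->cdw;

            sec->ndw = ndw;
            sec->cdw = 0;

            return (prev_cdw != prev_ndw) ? -1 : 0;
    }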