Recording the tail pointer in a local variable improves performance, but if
someone messes up and fails to reload at the right time, the driver will
write commands to the wrong part of the ring and scramble execution badly.
This change (available by setting I915_RING_VALIDATE to 1) checks to make
sure the cached tail pointer matches the hardware tail pointer at each ring
buffer addition, calling BUG_ON when that's not true.
There are now 3 lists. Active is buffers currently in the ringbuffer.
Flushing is not in the ringbuffer, but needs a flush before unbinding.
Inactive is as before. This prevents object_free → unbind →
wait_rendering → object_reference and a kernel oops about weird refcounting.
This also avoids an synchronous extra flush and wait when freeing a buffer
which had a write_domain set (such as a temporary rendered to and then from
using the 2d engine). It will sit around on the flushing list until the
appropriate flush gets emitted, or we need the GTT space for another
operation.
This lets us get some qualities we desire, such as using the full 32-bit
range (except zero), avoiding DRM_WAIT_ON, and a 1:1 mapping of active
sequence numbers to request structs, which will be used soon for throttling
and interrupt-driven list cleanup.
It would be nice if one day the DRM driver was the canonical source for
register definitions and core macros. To that end, this patch cleans
things up quite a bit, removing redundant definitions (some with
different names referring to the same register) and generally tidying up
the header file.
Now, the LRU list has objects that are completely done rendering and ready
to kick out, while the execution list has things with active rendering,
which have associated cookies and reference counts on them.
When batch buffers are executing, the ring may be stuck for a long time.
Monitor the ACTHD pointer which will show if the execution engine is
actually hung.
Conflicts:
linux-core/drm_compat.c
linux-core/drm_compat.h
linux-core/drm_ttm.c
shared-core/i915_dma.c
Bump driver minor to 13 due to introduction of new
relocation type.
This change adds a driver feature that for i915 is controlled by a module
parameter. You now need to do insmod i915.ko modeset=1 to enable it the
modesetting paths.
It also fixes up lots of X paths. I can run my new DDX driver on this code
with and without modesetting enabled
As DRM_DEBUG macro already prints out the __FUNCTION__ string (see
drivers/char/drm/drmP.h), it is not worth doing this again. At some
other places the ending "\n" was added.
airlied:- I cleaned up a few that this patch missed also
Document parameters and usage for drm_bo_handle_validate. Change parameter
order to match drm_bo_do_validate (fence_class has been moved to after
flags, hint and mask values). Existing users of this function have been
changed, but out-of-tree users must be modified separately.
One of the costs of superioctl has been the need to perform relocations
inside the kernel. The cost of mapping the buffers to the CPU and writing
data is fairly high, especially if those buffers have been mapped and read
by the GPU.
If we assume that buffers don't move around very often, we can have the
client compute the relocations itself using the previous GPU address. When
that object doesn't move, the kernel can skip computing and writing the
updated data.
Here's a patch which adds a new field to struct drm_bo_info_req called
'presumed_offset', and a new DRM_BO_HINT_PRESUMED_OFFSET that is set when
this field has been filled in by the client.
There are two separate optimizations performed when the presumed_offset is
correct:
1. i915_exec_reloc checks to see if all previous buffer offsets were guessed
correctly. If so, there's no need for it to look at *any* of the
relocations for a buffer. When this happens, it skips the whole
relocation process, simply returning success.
2. i915_apply_reloc checks to see if the target buffer offset was guessed
correctly. If so, it skips mapping the relocatee, computing the
relocation and writing the value. If no relocations are needed, the
relocatee should never be mapped to the CPU, and so the kernel shouldn't
need to wait for any fences to pass.
Conflicts:
linux-core/Makefile.kernel
linux-core/drm_stub.c
linux-core/i915_drv.c
shared-core/i915_dma.c
shared-core/i915_drv.h
Fixup suspend/resume conflicts (basically use what's in DRM master for now).
Also fix up a few other conflicts that snuck in (i915_dma changes etc.).
Add suspend/resume support to the i915 driver. Moves some of the
initialization into the driver load routine, and fixes up places where we
assumed no dev_private existed in some of the cleanup paths. This allows
us to suspend/resume properly even if X isn't running.