When i915_gem_retire_request has a flush which matches an object write
domain, clear the write domain. This will move the object to the inactive
list rather than the flushing list, avoiding trouble with objects left stuck
on the flushing list.
In i915_gem_object_wait_rendering, if the object write domain is being
written by the GPU, the appropriate flushing commands are written to the
device and an additional request queued to mark that flush. Finally, the
function blocks on that new request.
The bug was that the write_domain in the object was cleared before the
function blocked.
If the wait is interrupted by a signal, the flushing commands may still be
pending. With the current write_domain information lost, the restarted
syscall will drop right through the write_domain test as that value was
lost, and so the function will not block at all. Oops.
Fixed by simply moving the write_domain clear until after the wait_request
succeeds. Note that the restarted system call will generate an additional
flush sequence and request, but that should be 'harmless', aside from a
slight performance impact.
Someday we'll track flushing more accurately and clear write_domains more
efficiently, but for now, this should suffice.
This bug was discovered in the 2d gem development by running x11perf
-copypixwin500 and noticing that the window got cleared accidentally.
This makes our handling of cliprects sane. drm_clip_rect always has exclusive
bottom-right corners, but the hardware expects inclusive bottom-right corners,
so we adjust this here.
This complements Michel Daenzer's commit 57aea290e1e0a26d1e74df6cff777eb9f038f1f8
to Mesa. See also http://bugs.freedesktop.org/show_bug.cgi?id=16123 .
Main fix is an oops that was triggered by the gtt pwrite path when we don't
have the gtt initialized. Also, settle on -EBADF for "bad object handle",
and -EINVAL for "reading/writing beyond object boundary".
This is around 3x or so speedup, since we would read wide rows at a time, and
clflush each tile 8 times as a result. We'll want code related to this anyway
when we do fault-based per-page clflushing for sw fallbacks.
DRAW_INDEX writes a vertex count to VAP_VF_CNTL. Docs say that behaviour
is undefined (i.e. lockups happen) when this write is not followed by the
right number of vertex indices.
Thus we used to do the wrong thing when drawing across many cliprects was
necessary, because we emitted a sequence
DRAW_INDEX, DRAW_INDEX, INDX_BUFFER, INDX_BUFFER
instead of
DRAW_INDEX, INDX_BUFFER, DRAW_INDEX, INDX_BUFFER
The latter is what we're doing now and which ought to be correct.
this lets us debug the X server through xkb startup.
Not sure what the correct answer is, probably X needs to drop
the lock when execing stuff, with input hotplug it can get
xkb stuff at any time I believe.
This resolves a panic on FreeBSD which was caused by trying
to re-initialize the swap lock. It's just much easier to
initialize all of the locks at load time. It should also
ensure that the vblank structures are available earlier.
This increases overhead for the large-readpixels case due to the repeated
page cache accessing, but greatly reduces overhead for the small-readpixels
case.