Commit Graph

95 Commits (cf7cc62a9817a495264bbb037f0175cef9bd7a53)

Author SHA1 Message Date
Chris Wilson 13e8270504 intel: Print out debugging message following ENOSPC
execbuffer() returns ENOSPC if it cannot fit the batch buffer into the
aperture which is the error we want to diagnose here.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21 22:37:18 +01:00
Chris Wilson f16b4164d6 intel: Scan the cache for old bo once every second.
Rearrange the cache cleanup so that we always scan following a final
unreference, and guard against multiple scans in a single second.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21 22:37:18 +01:00
Chris Wilson 5eec286838 intel: Force stride to be 0 for I915_TILING_NONE.
When allocating a tiled buffer, if we remove the desired tiling mode due
to it being beyond hardware limits, also remove the stride. This ensures
that we only ever use stride 0 with I915_TILING_NONE.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21 22:37:18 +01:00
Chris Wilson 1db22ff741 intel: Defer tiling change to allocation.
As we now expose a method to allocate tiled buffers, it makes more sense
to defer the SET_TILING until required. Besides the slim chance that it
will be a no-op, by delaying the change we are less likely to stall on
waiting for a bound buffer to release a fence register.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21 22:37:18 +01:00
Chris Wilson 056aa9be04 intel: Track tiling stride
We need to inform the kernel if the tiling stride changes and not only
for changes of the tiling mode.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21 14:25:39 +01:00
Eric Anholt 4f7704aea7 intel: Fix several other paths for buffers pointing at themselves. 2010-06-10 09:02:14 -07:00
Eric Anholt 0ec768e67a intel: Add more intermediate sizes of cache buckets between powers of 2.
We had two cases recently where the rounding to powers of two hurt
badly: 4:2:0 YUV HD video frames would round up from 2.2MB to 4MB, and
Urban Terror was hitting aperture size limitations.  For UT, this is
because mipmap trees for power of two texture sizes will land right in
the middle between two cache buckets.

By giving a few more sizes between powers of two, Urban Terror on my
945 ends up consuming 207MB of GEM objects instead of 272MB, and HD
video decode on Ironlake goes from 99MB to 75MB.

cairo-perf-diff of the benchmarks for gl and xlib shows a 1.09x and
1.06x speedup and a 1.07x, 1.08x, and 1.11x slowdown.  From this, I
think this patch was really a no-op in terms of performance for these
CPU-bound workloads.
2010-06-10 08:56:56 -07:00
Chris Wilson e65caeba9e intel: Convert to untiled pitches if surface is too large for tiling.
If the pitch is too large for the hardware to tile, recompute the
required surface size based on the untiled pitch and alignments. For the
older hardware, which has smaller limits and greater restrictions, this
may be a considerable saving in allocation size.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-09 10:18:31 +01:00
Eric Anholt f179137f8f Allow a buffer to point at itself and still get relocs.
I'm using this in experiments with the i965 Mesa driver.
2010-06-07 17:29:57 -07:00
Zou Nan hai 66375fd6e8 intel: Add support for kernel multi-ringbuffer API.
This introduces a new API to exec on BSD ring buffer, for H.264 VLD
decoding.

Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com>
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
2010-06-06 15:50:38 -07:00
Chris Wilson fcf3e616ee intel: Don't change tiling mode unless the kernel reports success.
Fixes:

  Bug 26686 - Some textures are distorted with libdrm 2.4.18 in GTAVC&GTA3
  http://bugs.freedesktop.org/show_bug.cgi?id=26686

This bug continues to haunt me. The kernel SET_TILING ioctl is
inconsistent in its return values when reporting an error. If one of its
sanity checks fail, then the input values are left unchanged. If the
kernel later fails to change the tiling mode, then the input values are
modified to match the current tiling on the object. In short, userspace
cannot trust the return values upon error and so we must assume that
upon error our current tiling mode matches reality and not update.
2010-05-24 18:38:29 +01:00
Chris Wilson a3305b076c Revert "intel: We don't need to take the bufmgr lock whilst mapping."
This reverts commit 7ca558494d.

This was pushed ahead of an essential review of bo level locking in
mesa, without which we cannot know whether removing this lock is safe.
2010-05-13 08:25:56 +01:00
Chris Wilson 07e7589d86 intel: query whether a buffer is reusable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-11 08:54:06 +01:00
Chris Wilson 7ca558494d intel: We don't need to take the bufmgr lock whilst mapping. 2010-05-06 22:56:41 +01:00
Chris Wilson 3506173ba7 intel: Use the correct size when allocating reloc_target_info array
Thomas tracked down this error with kdm and commit b509640:

==4320== Invalid write of size 8
==4320==    at 0x9A97998: do_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0)
==4320==    by 0x9A97B9C: drm_intel_gem_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0)
==4320==    by 0xAED3234: intel_batchbuffer_emit_reloc (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xAF13827: brw_emit_vertices (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xAF1F14D: brw_upload_state (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xAF12122: brw_draw_prims (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xB256824: vbo_exec_vtx_flush (in /usr/lib/xorg/modules/dri/libdricore.so)
==4320==    by 0xB2523BB: vbo_exec_FlushVertices_internal (in /usr/lib/xorg/modules/dri/libdricore.so)
==4320==    by 0xB252411: vbo_exec_FlushVertices (in /usr/lib/xorg/modules/dri/libdricore.so)
==4320==    by 0xB195A3D: _mesa_PopAttrib (in /usr/lib/xorg/modules/dri/libdricore.so)
==4320==    by 0x8DF0F02: __glXDisp_Render (in /usr/lib/xorg/modules/extensions/libglx.xorg)
==4320==    by 0x8DF517F: __glXDispatch (in /usr/lib/xorg/modules/extensions/libglx.xorg)
==4320==  Address 0x126a8b80 is 0 bytes after a block of size 16,368 alloc'd
==4320==    at 0x4C23E03: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4320==    by 0x9A97A64: do_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0)
==4320==    by 0x9A97B9C: drm_intel_gem_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0)
==4320==    by 0xAED3234: intel_batchbuffer_emit_reloc (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xAF191DB: upload_binding_table_pointers (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xAF1F14D: brw_upload_state (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xAF12122: brw_draw_prims (in /usr/lib/xorg/modules/dri/i965_dri.so)
==4320==    by 0xB255EF6: vbo_exec_DrawArrays (in /usr/lib/xorg/modules/dri/libdricore.so)
==4320==    by 0x8DF67A3: __glXDisp_DrawArrays (in /usr/lib/xorg/modules/extensions/libglx.xorg)
==4320==    by 0x8DF0F02: __glXDisp_Render (in /usr/lib/xorg/modules/extensions/libglx.xorg)
==4320==    by 0x8DF517F: __glXDispatch (in /usr/lib/xorg/modules/extensions/libglx.xorg)
==4320==    by 0x446293: ??? (in /usr/bin/Xorg)

which is simply due to only allocating space for the pointers and not
the structs themselves. D'oh.

Reported-by: Thomas Bächler <thomas@archlinux.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-04-11 18:48:31 +01:00
Eric Anholt 7c697b1670 intel: Align untiled buffer pitch to 64B.
This is the largest untiled pitch requirement from gen2 through gen4.
It's only the case for gen3 rendering to color regions with depth, but
it's rare for this to be a significant factor in memory usage -- for
example, gen4 requires 1 or 2 times the element size, or up to 64
bytes depending on the size of the elements.  This is easier than
encoding all the various little quirks for untiled pitch alignment,
since we rarely do untiled now.
2010-03-17 11:15:45 -07:00
Pauli Nieminen 21105bc186 libdrm: Move intel_atomic.h to libdrm core for sharing.
intel_atomic.h includes very usefull atomic operations for
lock free parrallel access of variables. Moving these to
core libdrm for code sharing with radeon.

Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
2010-03-17 11:48:00 +02:00
Chris Wilson a4041e096c intel: Repeat execbuffer if interrupted by signal
Repeat while EINTR, not EAGAIN! One more source of corruption
erradicated, hurray!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-07 14:17:24 +00:00
Eric Anholt 1d4d1e6b13 intel: Only align Y-tiling pitch to the Y tile width.
Fixes piglit depth-tex-modes on gen4.
2010-03-04 16:27:45 -08:00
Chris Wilson 3e21e3ba00 intel: Propagate some more error returns
Ensure that errors from the kernel are propagated back to the caller,
and not masked with return 0;

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-04 21:17:48 +00:00
Eric Anholt 47102863ed intel: Update the needs_fence flag of buffers on the validate list.
Fixes fbo-copyteximage on i915 with texture tiling and execbuf2 fenced
relocs.
2010-03-03 10:08:44 -08:00
Eric Anholt 766fa7909f intel: Don't enable execbuf2 fenced relocs unless we have execbuf2. 2010-03-02 16:04:14 -08:00
Eric Anholt 292a5d73a2 intel: Don't tile-align pitch for untiled buffers.
This allows Mesa to use drm_intel_bo_alloc_tiled() for its tiled
buffers, since it makes its decision about pitch before telling
libdrm.  They happen to be the same choices for the tiled case.
2010-03-02 15:39:37 -08:00
Eric Anholt acbaff24e0 intel: Fix typo in conversion from IS_GEN to bufmgr_gem->gen.
Luckily I caught the bug with the first consumer of the interface.
2010-03-02 15:39:37 -08:00
Eric Anholt a0abb1b14e intel: add a comment about tiled buffer alloc height alignment from Mesa. 2010-03-02 15:39:37 -08:00
Eric Anholt a1f9ea765d intel: Use an integer for chipset generation instead of many conditionals.
Saves a bunch of comparisons in hot paths.
2010-03-02 10:12:46 -08:00
Jesse Barnes b50964027b libdrm/intel: execbuf2 support
This patch to libdrm adds support for the new execbuf2 ioctl.  If
detected, it will be used instead of the old ioctl.  By using the new
drm_intel_bufmgr_gem_enable_fenced_relocs(), you can indicate that any
time a fence register is actually required for a relocation target you
will call drm_intel_bo_emit_reloc_fence instead of
drm_intel_bo_emit_reloc, which will reduce fence register pressure.

Signed-off-by: Eric Anholt <eric@anholt.net>
2010-03-02 10:10:50 -08:00
Eric Anholt f6dc964e1d intel: Add initial support for Sandybridge, and clean up the #defines. 2010-02-25 10:41:03 -08:00
Chris Wilson 4f0f871730 intel: Handle resetting of input params after EINTR during SET_TILING
The SET_TILING is pernicious in that it overwrites the input arguments
following an error in order to report the current tiling state of the
buffer. This caught us by surprise as we then fed those arguments back
into to the ioctl unmodified following an EINTR and so the kernel then
reported success for the no-op. We interpreted this success as meaning
that the tiling on the buffer had changed so updated our state and
started using the buffer incorrectly in the new tiled/untiled manner.
This lead to all sorts of random corruption and GPU hangs, even though
the batch buffers would look sane (when the GPU had not wandered off
into forbidden territory).

References:

  Bug 25475 - [i915] Xorg crash / Execbuf while wedged
  http://bugs.freedesktop.org/show_bug.cgi?id=25475

  Bug 25554 - i830_uxa_prepare_access: gtt bo map failed: Input/output error
  http://bugs.freedesktop.org/show_bug.cgi?id=25554

(And probably every other weird bug in the last few months.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-02-10 09:52:36 +00:00
Chris Wilson fdcde592c2 intel: Account for potential pinned buffers hogging fences
As the kernel reports the total number of fences, we must guess how many
fences are likely to be pinned. In the typical system these will be only
used by the scanout buffers, of which there may be one per pipe, and any
number of manually pinned fenced buffers. So take a conservative guess
and reserve two fences for use by the system.

Note this reduces the number of fences to 3 for i915 and prior.

Reference:
  http://bugs.freedesktop.org/show_bug.cgi?id=25911
  The latest intel driver 2.10.0 causes kernel oops and system hangs

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-02-09 08:37:37 +00:00
Dave Airlie 973d8d6bd0 intel: check return value for calloc 2010-02-02 10:57:12 +10:00
Chris Wilson 08371bc290 intel: Clear virtual after failing to mmap_gtt.
Don't store the error return in bo_gem->gtt_virtual or else we will
attempt to use that as a valid pointer in future mappings.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-08 22:35:24 +00:00
Chris Wilson 9707733a13 intel: Expect caller to guarantee thread-safety of bo during reloc
This removes the foremost prolific user of mutexes in libdrm_intel.so.
The other uses of the bufmgr_gem->mutex to serial access to individual
bos are currently required by Mesa, and are far less frequent.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[anholt: This chunk looks good...]
Acked-by: Eric Anholt <eric@anholt.net>
2009-12-05 11:03:05 +00:00
Chris Wilson 57473c7f52 intel: Free memory before inserting bo into cache.
This has the unfortunate behaviour of releasing our malloc cache, but
the alternative is for X to consume a couple of gigabytes of ram and
die during testing. Fortunately the extra mallocs have little impact on
performance whereas avoiding swap and death, lots.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02 14:29:23 +00:00
Chris Wilson 792fed1e24 intel: Check and propagate errors from building reloc-tree
Instead of forcing the caller to check after every emit_reloc(), we can
flag the object as being in error, propagating that error upwards through
the relocation tree, and failing the eventual batch buffer execution.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02 14:18:31 +00:00
Chris Wilson b73612e4fd intel: Repeat execbuffer after EINTR
EAGAIN cannot be raised by the current code, but the system call maybe
interrupted and so return EINTR.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02 14:17:15 +00:00
Chris Wilson acb4aa6715 intel: Review use of errno.
Hitting this error lead to a segfault:

  intel_bufmgr_gem.c:919: Error mapping buffer 48607 (pixmap):
                          Cannot allocate memory.

because the errno was reused as the function return value after being
reset by the fprintf(), so caller thought the mapping had succeeded. The
convention established by libdrm is that the return value is the
negative errno and that uses of libdrm cannot trust the value of errno
afterwards, but must use the return code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02 14:17:04 +00:00
Chris Wilson 9fec2a8cb2 intel: Make bo_reference() inline for internal use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02 14:16:32 +00:00
Chris Wilson 9c8ad05e8b intel: Remove the extra reference while validating the reloc tree
Buffers on the relocation tree are guarded by the reference to the batch
object and so do not need an extra reference whilst constructing the
list of execution buffer objects.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02 14:16:32 +00:00
Chris Wilson 8ffd2e142e intel: Wrap a few more syscalls with EINTR protection
Having been bitten by a missing EINTR check during mmap_gtt(), I thought
it prudent to add some more protection around the ioctls.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-01 13:09:20 +00:00
Chris Wilson b666f4151a intel: Clear bo->used_as_reloc_target flag on destroy
This allows us to keep the assert added in the previous commit that we do
not modify the tree_reloc_size after inserting the buffer into a relocation
tree, which was being hit here:

  #0  0xb78c2424 in __kernel_vsyscall ()
  #1  0xb74f6401 in raise () from /lib/libc.so.6
  #2  0xb74f7b42 in abort () from /lib/libc.so.6
  #3  0xb74ef5a8 in __assert_fail () from /lib/libc.so.6
  #4  0xb737e78b in drm_intel_bo_gem_set_in_aperture_size (bufmgr_gem=<value optimized out>, bo_gem=0x6) at intel_bufmgr_gem.c:373
  #5  0xb737f519 in drm_intel_gem_bo_set_tiling (bo=0xa1030a0, tiling_mode=0xbff6c85c, stride=0) at intel_bufmgr_gem.c:1386
  #6  0xb737f67f in drm_intel_gem_bo_unreference_final (bo=0xa1030a0, time=<value optimized out>) at intel_bufmgr_gem.c:768
  #7  0xb737f5e3 in drm_intel_gem_bo_unreference_locked_timed (bo=0xa1e50d0, time=<value optimized out>) at intel_bufmgr_gem.c:805
  #8  drm_intel_gem_bo_unreference_final (bo=0xa1e50d0, time=<value optimized out>) at intel_bufmgr_gem.c:756
  #9  0xb737fcbb in drm_intel_gem_bo_unreference (bo=0xa1e50d0) at intel_bufmgr_gem.c:821
  #10 0xb737b4e6 in drm_intel_bo_unreference (bo=0x0) at intel_bufmgr.c:80
  #11 0xb7325625 in intel_batch_flush (scrn=0x9d91f78, flush=1) at i830_batchbuffer.c:200

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30 23:07:19 +00:00
Chris Wilson e22fb79772 intel: Apply pessimistic alignment to in-aperture buffer size
For the older chipsets, i.e. pre-i965, which have severe alignment
restrictions for tiled buffers we need to pessimistically assume that we
will waste the size of buffer to meet those alignment constraints.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30 22:18:15 +00:00
Chris Wilson 60aa803dc7 intel: Only store a buffer in the cache if it is retained.
If the kernel immediately frees the backing store for a buffer when
marking it purgeable, then there is not point adding to the cache. Free
it immediately, instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30 20:02:05 +00:00
Kristian Høgsberg 10ce0ec188 Merge remote branch 'origin/master' into libdrm 2009-11-20 17:09:03 -05:00
Kristian Høgsberg 4f57abfe66 Move libdrm/ up one level 2009-11-17 11:15:06 -05:00