Using this call in OUT_BATCH_TABLE reduces radeonEmitState cpu usage from
9% to 5% and emit_vpu goes from 7% to 1.5%. I did use calgrind to profile
gears for cpu hotspots with r500 card.
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
- The bo was mapped with sysmem == NULL, so this means cpu prep is called.
- The bo was unmapped with sysmem != NULL, so this means cpu finish is not called.
- This can lead to a non-zero "cpu writers" count in ttm_bo.
The goal of the BO cache is to keep buffers on hand for fast continuous use,
as in every frame of a game or every batchbuffer of the X Server. Keeping
older buffers on hand not only doesn't serve this purpose, it may hurt
performance by resulting in disk cache getting kicked out, or even driving
the system to swap.
Bug #20766.
The logbase2 would overflow and wrap the size around to 0, making the code
allocate a 4kb object instead. By simplifying the code to just walk the
14-entry bucket array comparing sizes instead of indexing on
ffs(1 << logbase2(size)), we avoid silly math errors and have code of
approximately the same speed.
Many thanks to Simon Farnsworth for debugging and providing a working patch.
Bug #27365.
Normal map() should operate as before, and map_range()/map_flush() should
give correct results but lacking any performance difference from map().
Nothing exiting being done here yet, but the interface is a good start.
This avoids making objects significantly bigger than they would be
otherwise, which would result in some failing at binding to the GTT.
Found from firefox hanging on:
http://upload.wikimedia.org/wikipedia/commons/b/b7/Singapore_port_panorama.jpg
due to a software fallback trying to do a GTT-mapped copy between two 73MB
BOs that were instead each 128MB, and failing because both couldn't fit
simultaneously.
The cost here is that we get no opportunity to cache these objects and
avoid the mapping. But since the objects are a significant percentage
of the aperture size, each mapped access is likely having to fault and rebind
the object most of the time anyway.
Bug #20152 (2/3)
The convention is that all APIs are per-bufmgr, so make this one the same.
Then, have it return -1 on failure so that the application can know what's
going on and do something sensible.
Signed-off-by: Keith Packard <keithp@keithp.com>
This wraps the new DRM_IOCTL_I915_GET_PIPE_FROM_CRTC_ID ioctl,
allowing applications to discover the pipe number corresponding
to a given CRTC ID. This is necessary for doing pipe-specific
operations such as waiting for vblank on a given CRTC.
Scanout buffers need to be freed through the kernel as it holds a reference
to them; exposing this API allows applications allocating scanout buffers to
flag them as not reusable.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Add assertions to drm_intel_gem_bo_reference,
drm_intel_gem_bo_reference_locked and drm_intel_gem_bo_unreference_locked
that the object has not been freed (refcount > 0). Mistakes in refcounting
lead to attempts to insert a bo into a free list more than once which causes
application failure as empty free lists are dereferenced as buffer objects.
Signed-off-by: Keith Packard <keithp@keithp.com>
This patch speeds up drmModeGetConnector by pre-allocating mode &
property info space before calling into the kernel. In many cases this
pre-allocation will be sufficient to hold the returned values (it's easy
enough to tweak if the common case becomes larger), which means we don't
have to make the second call, which saves a lot of time.
Acked-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
libdrm has some support for GTT mapping already, but there are bugs
with it (no surprise since it hasn't been used much).
In fixing 20803, I found that sharing bo_gem->virtual was a bad idea,
since a previously mapped object might not end up getting GTT mapped,
leading to corruption. So this patch splits the fields according to
use, taking care to unmap both at free time (but preserving the map
caching).
There's still a risk we might run out of mappings (there's a sysctl
tunable for max number of mappings per process, defaulted to 64k or so
it looks like) but at least GTT maps will work with these changes (and
some others for fixing PAT breakage in the kernel).
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>