Update kernel interface with new I915_GETPARAM ioctl entries for
subslice total and EU total. Add a wrapping function for each
parameter. Userspace drivers need these values when constructing
GPGPU commands. This kernel query method is intended to replace
the PCI ID-based tables that userspace drivers currently maintain.
The kernel driver can employ fuse register reads as needed to
ensure the most accurate determination of GT config attributes.
This first became important with Cherryview in which the config
could differ between devices with the same PCI ID.
The kernel detection of these values is device-specific. Userspace
drivers should continue to maintain ID-based tables for older
devices which return ENODEV when using this query.
v2: remove unnecessary include of <stdbool.h> and increment the
I915_GETPARAM indices to match updated kernel patch.
For: VIZ-4636
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Allow userptr objects to be created and used via libdrm_intel.
At the moment tiling and mapping to GTT aperture is not supported
due hardware limitations across different generations and uncertainty
about its usefulness.
v2: Improved error handling in feature detection per review comments.
v3: Rebase on top of the drm_public addition, minor whitespace addition.
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> (v3)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v1,v2)
The existing 'offset' field is unfortunately typed as 'unsigned long',
which is unfortunately only 4 bytes with a 32-bit userspace.
Traditionally, the hardware has only supported 32-bit virtual addresses,
so even though the kernel uses a __u64, the value would always fit.
However, Broadwell supports 48-bit addressing. So with a 64-bit kernel,
the card virtual address may be too large to fit in the 'offset' field.
Ideally, we would change the type of 'offset' to be a uint64_t---but
this would break the libdrm ABI. Instead, we create a new 'offset64'
field to hold the full 64-bit value from the kernel, and store the
32-bit truncation in the existing 'offset' field, for compatibility.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I would have just used the drmIoctl interface directly in Mesa, but the
ioctl needs some data from the drm_intel_context that is not exposed
outside libdrm.
This ioctl is in the drm-intel-next tree as b635991.
v2: Update based on Mika's kernel work.
v3: Fix compile failures from last-minute typos. Sigh.
v4: Import the actual changes from the kernel i915_drm.h. Only comments
on some fields of drm_i915_reset_stats differed. There are still some
deltas between the kernel i915_drm.h and the one in libdrm, but those
can be resolved in other patches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
This reverts commit 6335e1d28c.
No taxation without representation, in other words no userspace without kernel
stuff being in a stable location, either drm-next but I'll accept drm-intel-next
for intel specific stuff.
I would have just used the drmIoctl interface directly in Mesa, but the
ioctl needs some data from the drm_intel_context that is not exposed
outside libdrm.
v2: Update based on Mika's kernel work.
v3: Fix compile failures from last-minute typos. Sigh.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds interfaces for the X driver to use to create a
prime handle from a buffer, and create a bo from a handle.
v2: use Chris's suggested naming (well from at least for consistency)
v3: git commit --amend fail
v4: fix as per Chris's suggestions, group assignments, add get tiling
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
To support this we extract the common execbuf2 functionality to be
called with, or without contexts.
The context'd execbuf does not support some of the dri1 stuff.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
int drm_intel_gem_bo_wait(drm_intel_bo *bo, uint64_t timeout_ns)
This should bump the libdrm version. We're waiting for context support
so we can do both features in one bump.
v2: don't return remaining timeout amount
use get param and fallback for older kernels
v3: only doing getparam at init
prototypes now have a signed input value
v4: update comments
fall back to correct polling behavior with new userspace and old kernel
v5: since the drmIoctl patch was not well received, return appropriate
values in this function instead. As Daniel pointed out, the polling
case (timeout == 0) should also return -ETIME.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds a new function,
drm_intel_bufmgr_gem_set_aub_annotations(), which can be used to
annotate the type and subtype of data stored in various sections of
each buffer. This data is used to populate type and subtype fields
when generating the .aub file, which improves the ability of later
debugging tools to analyze the contents of the .aub file.
If drm_intel_bufmgr_gem_set_aub_annotations() is not called, then we
fall back to the old set of annotations (annotate the portion of the
batchbuffer that is executed as AUB_TRACE_TYPE_BATCH, and everything
else as AUB_TRACE_TYPE_NOTYPE).
Reviewed-by: Eric Anholt <eric@anholt.net>
This improves the performance of Mesa's GL_MAP_UNSYNCHRONIZED_BIT path
in GL_ARB_map_buffer_range. Improves Unigine Tropics performance at
1024x768 by 2.30482% +/- 0.0492146% (n=61)
v2: Fix comment grammar.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
This will allow the driver to capture all of its execution state to a
file for later debugging. intel_gpu_dump is limited in that it only
captures batchbuffers, and Mesa's captures, while more complete, still
capture only a portion of the state involved in execution.
This is a squash commit of a long series of hacking as we tried to get
the resulting traces to work in the internal simulator. It contains
contributions by Yuanhan Liu and Kenneth Graunke.
v2: Drop the MI_FLUSH_ENABLE setup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
For example:
export INTEL_DEVID_OVERRIDE=0x162
If this variable is set, don't actually submit the batchbuffer to the
GPU, it probably contains commands for the wrong generation of hardware.
v2: Introduce a getter for the overridden devid, and avoid getenv per exec.
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Consumers often want to choose stdout vs stderr, and for testing I
want to output to an open_memstream file.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
My plan is to use this drm_intel_dump_batchbuffer() interface for the
current GPU tools, and the current Mesa batch dumping usage, while
eventually building more interesting interfaces for other uses.
Warnings are currently suppressed by using a helper lib with CFLAGS
set manually, because the code is totally not ready for libdrm's warnings
setup.
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Eugeni Dodonov <eugeni@dodonov.net>
There is a per-process limit on the number of vma that the process can
keep open, so we cannot keep an unlimited cache of unused vma's (besides
keeping track of all those vma in the kernel adds considerable overhead).
However, in order to work around inefficiencies in the kernel it is
beneficial to reuse the vma, so keep a MRU cache of vma.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This lets us replace the current inner drawing loop of mesa:
for each prim {
compute bo list
if (check_aperture_space(bo list)) {
batch_flush()
compute bo list
if (check_aperture_space(bo list)) {
whine_about_batch_size()
fall back;
}
}
upload state to BOs
}
with this inner loop:
for each prim {
retry:
upload state to BOs
if (check_aperture_space(batch)) {
if (!retried) {
reset_to_last_prim()
batch_flush()
} else {
if (batch_flush())
whine_about_batch_size()
goto retry;
}
}
}
This avoids having to implement code to walk over certain sets of GL
state twice (the "compute bo list" step). While it's not a
performance improvement, it's a significant win in code complexity:
about -200 lines, and one place to make mistakes related to aperture
space instead of N places to forget some BO we should have included.
Note how if we do a reset in the new loop , we immediately flush. We
don't need to check aperture space -- the kernel will tell us if we
actually ran out of aperture or not. And if we did run out of
aperture, it's because either the single prim was too big, or because
check_aperture was wrong at the point of setting up the last
primitive.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This introduces a new API to exec on BSD ring buffer, for H.264 VLD
decoding.
Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com>
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
This patch to libdrm adds support for the new execbuf2 ioctl. If
detected, it will be used instead of the old ioctl. By using the new
drm_intel_bufmgr_gem_enable_fenced_relocs(), you can indicate that any
time a fence register is actually required for a relocation target you
will call drm_intel_bo_emit_reloc_fence instead of
drm_intel_bo_emit_reloc, which will reduce fence register pressure.
Signed-off-by: Eric Anholt <eric@anholt.net>