This is similar to b81d44d587d1706d5c7568e539340632a748782b: the
DRM_RDWR flag is needed for mmap to work.
Signed-off-by: Nicholas Bishop <nicholasbishop@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Platforms without a HW detiler doesn't support the get_tiling IOCTL.
Fix the drm_intel_bo_gem_create_from_* functions assuming the default
no-tiling, no-swizzling setting for the GEM buffer in this case.
v2:
- Add the missing gem handle IOCTL parameter. (Eric)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
This was done with:
while read sym; do
read f func line _ <<<$(cscope -d -L -1 $sym)
if [ ! -z "$f" ]; then
line=$((line-1))
sed -i "${line}s/^/drm_public /" $f
fi
done < /tmp/a.txt
Then some corner cases were manually fixed. "a.txt" above contains the
symbols collected from intel/intel-symbol-check. The idea here will be
to switch the default visibility to hidden so we don't export symbols we
shouldn't.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We don't need to call IS_GEN() for each gen >= 9: we can rather use the
new intel_is_genx() helper to iterate the pciids array once.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add the PCI IDs and the basic code to enable ICL. This is the current
PCI ID list in our documentation.
Kernel commit: d55cb4fa2cf0 ("drm/i915/icl: Add the ICL PCI IDs")
v2: Michel provided a fix to IS_9XX that was broken by rebase bot.
v3: Fix double definition of PCI IDs, update IDs according to bspec
and keep them in the same order and rebase (Lucas)
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
This will prevent any more missing `#include "config.h"` bug, at the
cost of having to recompile some files that didn't need to be when
changing build options.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
drm_intel_gem_bo_free() unconditionally attempts to remove the handle
from the hashtable. This goes horribly wrong if we haven't already added
the bo to the hashtable.
Reported-by: Michael Thayer <michael.thayer@oracle.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Use obj->kflags to set EXEC_OBJECT_PINNED when the object is softpinned,
and so remember to clear the softpin status when the object is freed
(and reused).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Implicit synchronisation is the default behaviour of the kernel when
rendering with an execobject. It may be disabled with
drm_intel_gem_bo_disable_implicit_sync(), and then to restore it use
drm_intel_gem_bo_enable_implicit_sync().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Allow the caller to pass in an fd to an array of fences to control
serialisation of the execbuf in the kernel and on the GPU, and in return
allow creation of a fence fd for signaling the completion (and flushing)
of the batch. When the returned fence is signaled, all writes to the
buffers inside the batch will be complete and coherent from the cpu, or
other consumers. The return fence is a sync_file object and can be
passed to other users (such as atomic modesetting, or other drivers).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The kernel allows implicit synchronisation to be disabled on individual
buffers. Use at your own risk.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
bo->global_name should be updated first before a hash value
for the entry is calculated with it by HASH_ADD macro.
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Just some trivial boring typo fixes all over the tree.
READMEs and comments only.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Exposing the u32 context ID makes it possible to define new drm kernel
interfaces based on the same IDs that e.g. execbuf uses to identify a
gem context, that aren't themselves abstracted by libdrm but need to be
used by libdrm/drm_intel_context based clients such as (parts of) i-g-t
or Mesa.
For example this can be used to configure an i915-perf stream to collect
metrics for a specific context.
v2: s/drm_intel_gem_context_get_context_id/drm_intel_gem_context_get_id/
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
As well as allowing a hexadecimal PCI ID number, the
INTEL_DEVID_OVERRIDE environment variable can now contain one of a few
short codenames. The codenames are stored in a small table to map them
to a corresponding PCI ID. This makes it easier to use without having
to look up the PCI IDs manually.
The PCI IDs used are the same as those chosen for the -p option of
run.c in shader-db but SKL has been added as well.
Reviewed-by: Matt Turner <mattst88@gmail.com>
A slightly confused copy'n'paste from the open path where we pass in
handle but use it as a global name, in the prime handle-from-fd pass we
pass in handle and do mean handle!
References: https://bugs.freedesktop.org/show_bug.cgi?id=98416
Fixes: 2f23bf1b7b89 ("intel: Migrate handle/name lookups from linear lists...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Walking a linear list to find a matching PRIME handle or flinked name
does not scale and becomes a major burden with just a few objects.
That said, the fixed size hash is not much better, it just buckets the
look into a few separate chains rather than one long one.
References: https://bugs.freedesktop.org/show_bug.cgi?id=94631
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Export a set of interfaces to allow the caller to have precise control
over mapping the buffer - but still provide caching of the mmaps between
callers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Update kernel interface with new I915_GETPARAM ioctl entries for
pooled EU and min no. of eus in a pool. Add a wrapping function
for each parameter. Userspace drivers need these values when decide
the thread count. This kernel enabled pooled eu by default for BXT
and for fused down 2x6 parts it is advised to turn it off.
But there is another HW issue in these parts (fused
down 2x6 parts) before C0 that requires Pooled EU to be enabled as a
workaround. In this case the pool configuration changes depending upon
which subslice is disabled and the no. of eus in a pool is different,
So userspace need to know min no. of eus in a pool.
V2: use return value as the query results.
ret < 0 when error, ret = 0 when not support, and ret > 0 indicate
query results.(Chris)
V3: Correct V2 errors.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Softpin allows userspace to take greater control of GPU virtual address
space and eliminates the need of relocations. It can also be used to
mirror addresses between GPU and CPU (shared virtual memory).
Calls to drm_intel_bo_emit_reloc are still required to build the list of
drm_i915_gem_exec_objects at exec time, but no entries in relocs are
created. Self-relocs don't make any sense for softpinned objects and can
indicate a programming errors, thus are forbidden. Softpinned objects
are marked by asterisk in debug dumps.
Cc: Thomas Daniel <thomas.daniel@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Zou Nanhai <nanhai.zou@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Gen8+ supports 48-bit virtual addresses, but some objects must always be
allocated inside the 32-bit address range.
In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
General State Heap (GSH) or Instruction State Heap (ISH) must be in a
32-bit range, because the General State Offset and Instruction State Offset
are limited to 32-bits.
The i915 driver has been modified to provide a flag to set when the 4GB
limit is not necessary in a given bo (EXEC_OBJECT_SUPPORTS_48B_ADDRESS).
48-bit range will only be used when explicitly requested.
Callers to the existing drm_intel_bo_emit_reloc function should set the
use_48b_address_range flag beforehand, in order to use full ppgtt range.
v2: Make set/clear functions nops on pre-gen8 platforms, and use them
internally in emit_reloc functions (Ben)
s/48BADDRESS/48B_ADDRESS/ (Dave)
v3: Keep set/clear functions internal, no-one needs to use them directly.
v4: Don't set 48bit-support flag in emit reloc, check for ppgtt type
before enabling set/clear function, print full offsets in debug
statements, using port of lower_32_bits and upper_32_bits from linux
kernel (Michał)
References: http://lists.freedesktop.org/archives/intel-gfx/2015-July/072612.html
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
...to minimise misuse of bo_gem.
If the variable is declared at the top of the function and then used
for two (or more) different contexts this can cause confusion and errors.
Just introduce a wrapper, which can be used in a once off situations.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by Chris Wilson <chris@chris-wilson.co.uk>
Just like we do for the original exec()
v2: move bo_gem declaration to the top of the function.
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by Chris Wilson <chris@chris-wilson.co.uk>
v2: keep the bo_gem declaration in exec2() within the loop (Chris)
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by Chris Wilson <chris@chris-wilson.co.uk>
It is possible to hit a race condition in create_from_prime, when trying
to import a BO that's currently being freed. In case of prime sharing
we'll succesfully get a handle, but fail on get_tiling call, potentially
confusing the caller (and requiring different locking scheme than with
sharing using flink). Wrap fd_to_handle with struct_mutex to force
a more consistent behaviour between prime/flink, convert fprintf to DBG
when handling errors.
(From Chris:
The race is that the kernel returns us the same file-private handle as
the first thread, but that first thread is about to call gem_close
(thereby removing the handle from the file completely) and does so
between us acquiring the handle and taking the mutex. If we take
the mutex, then we acquire the refcnt on the bo prior to the first
thread completing its unref (and so preventing the early close). Or we
acquire the handle after the earlier close, in which case we are the new
owner.
)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Testcase: igt/drm_import_export/import-close-race-prime
Signed-off-by: Rafał Sapała <rafal.a.sapala@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
We now have a separate tool for this in intel-gpu-tools and we don't
need to clutter up libdrm with this feature. We leave the entry points
in there to avoid breaking API/ABI.
Install intel-gpu-tools, then run (for example)
$ intel_aubdump --output=trace.aub glxgears -geometry 500x500
See the intel_aubdump man page for more details.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
In case of YF/YS tiled buffers libdrm need not know about the tiling
format because these buffers don't have hardware support to be tiled
or detiled through a fenced region. But, libdrm still need to know
about buffer alignment restrictions because kernel uses it when
resolving the relocation.
Mesa uses drm_intel_gem_bo_alloc_for_render() to allocate Yf/Ys buffers.
So, use the passed alignment value in this function to initialize the
align variable in drm_intel_bo. Note that we continue ignoring the
alignment value passed to drm_intel_gem_bo_alloc() to follow the
previous behavior.
V2: Add a condition to avoid allocation from cache. (Ben)
V3: Make no changes in cache allocation strategy. Just update the alignment.
Update the aperture size estimate including the alignment. (Ben, Chris)
V4: Move aperture size adjustments inside drm_intel_bo_gem_set_in_aperture_size()
Don't split sentences across the one-line header and the changelog. (Chris)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
In order to use userptr, the kernel tracks the owner's mm with a
mmu_notifier. Setting that is very expensive - it involves taking all
mm_locks and a stop_machine(). This tracking lives only for as long as
the client is using userptr objects - so if the client allocates then
frees a userptr in a loop, we will be executing that heavyweight setup
everytime. To ammoritize this cost, just leak the test bo and the single
backing page we use for detecting userptr.
v2: Free the object and memory when bufmgr is destroyed.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Some compilers (like the Oracle Studio), require that the function
declaration must be annotated with the same visibility attribute as the
definition. As annotating functions with drm_public is no longer
required just remove the macro.
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Thierry Reding <treding@nvidia.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Running __mmu_notifier_register() is surprisingly expensive, so let's
not do that unless we have to.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Update kernel interface with new I915_GETPARAM ioctl entries for
subslice total and EU total. Add a wrapping function for each
parameter. Userspace drivers need these values when constructing
GPGPU commands. This kernel query method is intended to replace
the PCI ID-based tables that userspace drivers currently maintain.
The kernel driver can employ fuse register reads as needed to
ensure the most accurate determination of GT config attributes.
This first became important with Cherryview in which the config
could differ between devices with the same PCI ID.
The kernel detection of these values is device-specific. Userspace
drivers should continue to maintain ID-based tables for older
devices which return ENODEV when using this query.
v2: remove unnecessary include of <stdbool.h> and increment the
I915_GETPARAM indices to match updated kernel patch.
For: VIZ-4636
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
This reverts commit 080b4929b7.
Chris noticed that "negative values wait forever" is indeed intended
behaviour and the issue is just that we didn't have a testcase (fixed
now) and that a regression slipped through (fixed and on track for all
stable kernels).
So lets undo the documentation change for consistency, since working
around kernel regressions isn't good. Practical impact is nil anyway.
v2: Add a note to docs that some kernels have been broken.
v3: Remove the random garbage included by accident.
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
The kernel doesn't actually wait indefinately when passed a negative,
timeout, it returns immediately. Document this and suggest using INT64_MAX
for indefinite waits.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We really have to do this to avoid surprises when extending the ABI
later on. Especially when growing the structures.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
On recent emulator GTT entry setup for aub dump needs mem type as
GTT_ENTRY instead of NONLOCAL. NONLOCAL would write data in main
memory space which is wrong with new memory layout. GTT_ENTRY write
would setup GTT memory pool and other required internal buffers. With
this I can run aub dump on latest release without crash.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm_intel_gem_bo_free() crashes because the list bo_gem->vma_list is not
yet initialised, but the error path tries to free it.
See also https://bugs.freedesktop.org/show_bug.cgi?id=75844
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
When handling the error on GEM_CLOSE, we weren't freeing the allocated
page. Plug that.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
The changes make sure that members of the bufmgr_gem and bo_gem
name lists are sychronized between threads
when using the create from prime and create from name methods.
Signed-off-by: Rafal Sapala <rafal.a.sapala@intel.com>
Testcase: igt/drm_import_export
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>