The i915_vblank_swap() function schedules an automatic buffer swap
upon receipt of the vertical sync interrupt. Such an operation is
lengthy so it can't be allowed to happen in normal interrupt context,
thus the DRM implements this by scheduling the work in a kernel
softirq-scheduled tasklet. In order for the buffer swap to work
safely, the DRM's central lock must be taken, via a call to
drm_lock_take() located in drivers/char/drm/drm_irq.c within the
function drm_locked_tasklet_func(). The lock-taking logic uses a
non-interrupt-blocking spinlock to implement the manipulations needed
to take the lock. This semantic would be safe if all attempts to use
the spinlock only happen from process context. However this buffer
swap happens from softirq context which is really a form of interrupt
context. Thus we have an unsafe situation, in that
drm_locked_tasklet_func() can block on a spinlock already taken by a
thread in process context which will never get scheduled again because
of the blocked softirq tasklet. This wedges the kernel hard.
To trigger this bug, run a dual-head cloned mode configuration which
uses the i915 drm, then execute an opengl application which
synchronizes buffer swaps against the vertical sync interrupt. In my
testing, a lockup always results after running anywhere from 5 minutes
to an hour and a half. I believe dual-head is needed to really
trigger the problem because then the vertical sync interrupt handling
is no longer predictable (due to being interrupt-sourced from two
different heads running at different speeds). This raises the
probability of the tasklet trying to run while the userspace DRI is
doing things to the GPU (and manipulating the DRM lock).
The fix is to change the relevant spinlock semantics to be the
interrupt-blocking form. After this change I am no longer able to
trigger the lockup; the longest test run so far was 20 hours (test
stopped after that point).
Note: I have examined the places where this spinlock is being
employed; all are reasonably short bounded sequences and should be
suitable for interrupts being blocked without impacting overall kernel
interrupt response latency.
Signed-off-by: Mike Isely <isely@pobox.com>
Conflicts:
linux-core/drm_compat.c
linux-core/drm_compat.h
linux-core/drm_ttm.c
shared-core/i915_dma.c
Bump driver minor to 13 due to introduction of new
relocation type.
This patch fixes bits of the DRM so to make the radeon DRI work on
non-cache coherent PCI DMA variants of the PowerPC processors.
It moves the few places that needs change to wrappers to that
other architectures with similar issues can easily add their
own changes to those wrappers, at least until we have more useful
generic kernel API.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
PCI- or high memory.
This is substantially more efficient than drm_bo_kmap,
since the mapping only lives on a single processor.
Unmapping is done use kunmap_atomic(). Flushes only a single tlb() entry.
Add a support utility int drm_bo_pfn_prot() that returns the
pfn and desired page protection for a given bo offset.
This is all intended for relocations in bound TTMS or vram.
Mapping-accessing-unmapping must be atomic, either using preempt_xx() macros
or a spinlock.
PCI- or high memory.
This is substantially more efficient than drm_bo_kmap,
since the mapping only lives on a single processor.
Unmapping is done use kunmap_atomic(). Flushes only a single tlb() entry.
Add a support utility int drm_bo_pfn_prot() that returns the
pfn and desired page protection for a given bo offset.
This is all intended for relocations in bound TTMS or vram.
Mapping-accessing-unmapping must be atomic, either using preempt_xx() macros
or a spinlock.
This fixes at least two problems:
* The vblank_disable_fn timer callback could get called after the DRM was
de-initialized, e.g. on X server shutdown.
* Leak of vblank related resources when disabling and re-enabling the IRQ, e.g.
on an X server reset.
On many chipsets, the checks for DPLL enable or VGA mode will prevent the
pipeconf regs from being restored, which could result in a blank display or X
failing to come back after resume. So restore them unconditionally along with
actually restoring pipe B's palette correctly.
On resume, if the interrupt state isn't restored correctly, we may end
up with a flood of unexpected or ill-timed interrupts, which could cause
the kernel to disable the interrupt or vblank events to happen at the
wrong time. So save/restore them properly.
There were two problems with the existing callback code: the vblank
enable callback happened multiple times per disable, making drivers more
complex than they had to be, and there was a race between the final
decrement of the vblank usage counter and the next enable call, which
could have resulted in a put->schedule disable->get->enable->disable
sequence, which would be bad.
So add a new vblank_enabled array to track vblank enable on per-pipe
basis, and add a lock to protect it along with the refcount +
enable/disable calls to fix the race.
sequence number may actually turn up before the corresponding fence
object has been queued on the ring.
Fence drivers can use this member to determine whether a
sequence number must be re-reported.
In hibernate, we may end up calling the VGA save regs function twice, so we
need to make sure it's idempotent. That means leaving ARX in index mode after
the first save operation. Fixes hibernate on 965.
i915_flush_ttm was unconditionally executing a clflush instruction
to (obviously) flush the cache. Instead, check if the cpu supports
clflush, and if not, fall back to calling wbinvd to flush the entire
cache.
Signed-off-by: Kyle McMartin <kmcmartin@redhat.com>
(1 << bits) is an undefined value when bits == 32.
gcc may generate 1 with this expression
which will lead to an infinite retry loop in
drm_ht_just_insert_please.
Because of the different implement of hash_long,
this issue is more frequenly see on 64 bit system
As DRM_DEBUG macro already prints out the __FUNCTION__ string (see
drivers/char/drm/drmP.h), it is not worth doing this again. At some
other places the ending "\n" was added.
airlied:- I cleaned up a few that this patch missed also
I couldn't figure out what drm_bo_type_dc was for; Dave Airlie finally clued
me in that it was the 'normal' buffer objects with kernel allocated pages
that could be mmapped from the drm device file.
I thought that 'drm_bo_type_device' was a more descriptive name.
I also added a bunch of comments describing the use of the type enum values and
the functions that use them.
Flags pending validation were stored in a misleadingly named field, 'mask'.
As 'mask' is already used to indicate pieces of a flags field which are
changing, it seems better to use a name reflecting the actual purpose of
this field. I chose 'proposed_flags' as they may not actually end up in
'flags', and in an case will be modified when they are moved over.
This affects the API, but not ABI of the user-mode interface.
Previously, dummy_read_page was used only for read-only user allocations; it
filled in pages that were not present in the user address map (presumably,
these were allocated but never written to pages).
This patch allows them to be used for read-only ttms allocated from the
kernel, so that applications can over-allocate buffers without forcing every
page to be allocated.
I'm hoping to use the dummy_read_page for kernel allocated buffers to avoid
allocating extra pages for read-only buffers (like vertex and batch buffers).
This also eliminates the 'write' parameter to drm_ttm_set_user and just
has DRM_TTM_PAGE_WRITE passed into drm_ttm_create.
Aside from changing drm_bind_ttm to drm_ttm_bind, this patch
adds only documentation and fixes the functions inside drm_ttm.c
to all be prefixed with drm_ttm_.
drmBOSetStatus does not bother to set the fence_class parameter.
Fortunately, drm_bo_setstatus_ioctl doesn't end up using it as it
calls drm_bo_handle_validate with use_old_fence_class = 1.
Document parameters and usage for drm_bo_handle_validate. Change parameter
order to match drm_bo_do_validate (fence_class has been moved to after
flags, hint and mask values). Existing users of this function have been
changed, but out-of-tree users must be modified separately.
Add comments about the parameters to drm_bo_do_validate, along
with comments for the DRM_BO_HINT options. Remove the 'do_wait'
parameter as it is duplicated by DRM_BO_HINT_DONT_BLOCK.
Creating a ttm was done with drm_ttm_init while destruction was done with
drm_destroy_ttm. Renaming these to drm_ttm_create and drm_ttm_destroy makes
their use clearer. Passing page_flags to the create function will allow that
to know whether user or kernel pages are needed, with the goal of allowing
kernel ttms to be saved for later reuse.
This header file is shared across linux and bsd, but is not installed
for user space to access. It's the place to put prototypes and data
types that aren't platform or chipset specific, but still internal to
the drm.