This makes our handling of cliprects sane. drm_clip_rect always has exclusive
bottom-right corners, but the hardware expects inclusive bottom-right corners,
so we adjust this here.
This complements Michel Daenzer's commit 57aea290e1e0a26d1e74df6cff777eb9f038f1f8
to Mesa. See also http://bugs.freedesktop.org/show_bug.cgi?id=16123 .
This is around 3x or so speedup, since we would read wide rows at a time, and
clflush each tile 8 times as a result. We'll want code related to this anyway
when we do fault-based per-page clflushing for sw fallbacks.
DRAW_INDEX writes a vertex count to VAP_VF_CNTL. Docs say that behaviour
is undefined (i.e. lockups happen) when this write is not followed by the
right number of vertex indices.
Thus we used to do the wrong thing when drawing across many cliprects was
necessary, because we emitted a sequence
DRAW_INDEX, DRAW_INDEX, INDX_BUFFER, INDX_BUFFER
instead of
DRAW_INDEX, INDX_BUFFER, DRAW_INDEX, INDX_BUFFER
The latter is what we're doing now and which ought to be correct.
This is an initial import of the atom bios parser with modesetting support
for r500 hw using atombios. It also includes a simple memory manager
layer that translates a radeon GEM style interface onto TTM internally.
So far this memory manager has only been used for pinned object allocation
for the DDX to test modesetting.
This resolves a panic on FreeBSD which was caused by trying
to re-initialize the swap lock. It's just much easier to
initialize all of the locks at load time. It should also
ensure that the vblank structures are available earlier.
Thanks to the reworked vblank-rework, we can just use the hardware frame
counter directly, and make the RADEON_PARAM_VBLANK_CRTC getparam just return
what was set by the corresponding setparam.
We depend on the VM fully now for memory protection, separate DMA objects
for VRAM and GART are unneccesary. However, until the next interface break
(soon) a client can't depend on the objects being the same and must still
call NV_OBJ_SET_DMA_* methods appropriately.
Remove the unused (and broken) "in vblank" code now that the core has
been fixed to use a counter while interrupts are enabled. Also make the
vblank pipe get/set ioctls into dumb stub functions, since with the new
code we can no longer let userspace control whether vblank interrupts
are enabled, or the core code will misbehave.
modifications to make it work correctly on my test hardware (altered the
backlight write function, made it enable the legacy backlight controller
interrupts on mobile hardware, sorted the interrupt function so we don't
get an excessive number of vblank interrupts). This lets the backlight
keys on my T61 work properly, though there's a 750msec or so delay
between the request and the brightness actually changing - this sounds
awfully like the hardware spinning waiting for a status flag to become
ready, but as far as I can tell they're all set correctly. If anyone can
figure out what's wrong here, it'd be nice to know.
Some of the functions are still stubs and just tell the hardware that
the request was successful. These can be filled in as kernel modesetting
gets integrated. I think it's worth getting this in anyway, since it's
required for backlight control to work properly on some new platforms.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Various chips have exciting interactions between the CPU and the GPU's
different ways of accessing interleaved memory, so we need some kernel
assistance in determining how it works.
Only fully tested on GM965 so far.
- There is one fb, used for as many outputs as possible.
- Eventually smaller screens will be scaled to see the full console, but for the moment this'll do.
When scheduled swaps occur, we need to blit between front & back buffers. I
the buffers are tiled, we need to set the appropriate XY_SRC_COPY tile bit,
only on 965 chips, since it will cause corruption on pre-965 (e.g. 945).
Bug reported by and fix tested by Tomas Janousek <tomi@nomi.cz>.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We won't get a PFIFO context switch when the same channel ID is recreated if
the hw still thinks the channel is already active, which causes fun issues.
Should allow X to be stopped and started without tearing down the entire
card state in lastclose().
This avoids seeing garbage from engine setup etc before X gets around
to pointing the CRTCs at a new scanout buffer. Not actually a noticable
problem before G80 as PRAMIN is forced to the end of VRAM by the hardware
already.
We won't get a PFIFO context switch when the same channel ID is recreated if
the hw still thinks the channel is already active, which causes fun issues.
Should allow X to be stopped and started without tearing down the entire
card state in lastclose().
With the interrupt enable/disable using only the mask register, it was wrong
to use the enable register to detect which pipes had vblank detection
turned on. Also, as we keep a local copy of the mask register around, and
MSI machines smack the hardware during the interrupt handler, it is more
efficient and more correct to use the local copy.
It would be nice if one day the DRM driver was the canonical source for
register definitions and core macros. To that end, this patch cleans things up
quite a bit, removing redundant definitions (some with different names
referring to the same register) and generally tidying up the header file.
Clean up queues, free objects. On the next entervt, unmark the hardware to
let the user try again (presumably after resetting the chip). Someday we'll
automatically recover...
We depend on the VM fully now for memory protection, separate DMA objects
for VRAM and GART are unneccesary. However, until the next interface break
(soon) a client can't depend on the objects being the same and must still
call NV_OBJ_SET_DMA_* methods appropriately.
This avoids seeing garbage from engine setup etc before X gets around
to pointing the CRTCs at a new scanout buffer. Not actually a noticable
problem before G80 as PRAMIN is forced to the end of VRAM by the hardware
already.
Noting that the interrupt mask register was more reliable than the interrupt
enable register for managing interrupts in user_irq_on/user_irq_off, this
patch replaces the remaining IER frobbing with IMR instead.
The test which exposes IER related failures is:
$ glxgears & glxgears & glxgears
(reposition the glxgears windows away from the upper left corner)
$ while :; do x11perf -rect100 -reps 800 -repeat 1; sleep 1; done &
$ while :; do runoa; runet; done &
find_or_create_page doesn't quite set up pages correctly; any newly created
pages aren't hooked into the shmem object quite right; user space mmaps of
those pages end up mapping pages full of zeros which then get written to the
real pages inappropriately. This patch requires that the kernel export
shmem_getpage.
When a software fallback has completed, usermode must notify the kernel so
that any scanout buffers can be synchronized. This ioctl should be called
whenever a fallback completes to flush CPU and chipset caches.
Lots of conflicts, seems to load ok, but I'm sure some bugs snuck in.
Conflicts:
linux-core/drmP.h
linux-core/drm_lock.c
linux-core/i915_gem.c
shared-core/drm.h
shared-core/i915_dma.c
shared-core/i915_drv.h
shared-core/i915_irq.c
This fixes registration when MSI is set up after the stub function fills in
dev->irq. Otherwise /proc/interrupts would report attachment to the fasteoi
interrupt. dev->irq is still exposed (and updated at IRQ setup)
for the drivers that use it for whatever reason.
Record the last execbuffer sequence for each client.
Record that sequence in the throttle ioctl as the 'throttle sequence'.
Wait for the last throttle sequence in the throttle ioctl.
The interrupt identity register must be writen before any work occurs lest
we drop an interrupt on the floor. This patch just shuffles code around to
make sure that IIR is written as early as possible.
We want request retirement to occur about once a second when the request
queue is non-empty. This was done with a timer that queued a work_struct,
using a delayed_work instead makes a lot more sense.
In the short-circuit code for the breadcrumb already being new enough, we
need to update the sarea_priv copy of the breadcrumb just as if we had
waited. Otherwise userland error checking will notice that we returned
too early based on its wrong information, and call wait_irq again (leading
to spinning until someone else comes along and updates the sarea_priv).
This bug was hidden when we had interrupt masking disabled, such as in
master, since the interrupt handler would update sarea_priv.
This was insufficient once we started masking interrupts to only when someone
was waiting for them (and would thus retire requests themselves). It was
replaced by the retire_timer.
This patch should fixe hard lockup and convert them in
softlockup (ie you can ssh the box but the gpu is busted
and we are waiting in loop for it to come back to reason).
Use GEM for ring buffer setup and framebuffer allocation. This means reworking
the hardware status page stuff a bit (just use the basic range allocator for
vram for now) and #ifdef'ing out the TTM & DRI2 code. Works well enough to
load/unload several times and display fbcon on my T61 (though there's still
some unexplained console corruption).
This is the create (may want location flags), pread/pwrite/mmap
(performance tuning hints), and set_domain (will 32 bits be enough for
everyone?) ioctls. Left in the generic set are just flink/open/close.
The 2D driver must be updated for this change, and API but not ABI is broken
for 3D. The driver version is bumped to mark this.
Use new GEM based ring buffer initialization. Still need to init GEM & use it
for framebuffer allocation etc.
Conflicts:
shared-core/i915_dma.c
shared-core/i915_drv.h
This requires that the X Server use the execbuf interface for buffer
submission, as it no longer has direct access to the ring. This is
therefore a flag day for the gem interface.
This also adds enter/leavevt ioctls for use by the X Server. These would
get stubbed out in a modesetting implementation, but are required while
in an environment where the device's state is only managed by the DRM while
X has the VT.
Without the user IRQ running constantly, there's no wakeup when the ring
empties to go retire requests and free buffers. Use a 1 second timer to make
that happen more often.
Instead of throttling and execbuffer time, have the application ask to
throttle explicitly. This allows the throttle to happen less often, and
without holding the DRM lock.
Idea being if you want to add new crtc/output/encoder dynamically later,
you just increase the generation counter and userspace should re-read
all the resources
Okay we have crtc, encoder and connectors.
No more outputs exposed beyond driver internals
I've broken intel tv connector stuff.
Really for TV we should have one TV connector, with a sub property for the
type of signal been driven over it
Use subclassing from the drivers to allocate the objects. This saves
two objects being allocated for each crtc/output and generally makes
exit paths cleaner.
This splits a lot of the core modesetting code out into a file of
helper functions, that are only called from themselves and/or the driver.
The driver gets called into more often or can call these functions from itself
if it is a helper using driver.
I've broken framebuffer resize doing this but I didn't like the API for that
in any case.
The interrupt enable register cannot be used to temporarily disable
interrupts, instead use the interrupt mask register.
Note that this change means that a pile of buffers will be left stuck on the
chip as the final interrupts will not be recognized to come and drain things.
Add code to get panel modes from the VBIOS if present and check whether certain
outputs exist. Should make our display detection code a little more robust.
Recording the tail pointer in a local variable improves performance, but if
someone messes up and fails to reload at the right time, the driver will
write commands to the wrong part of the ring and scramble execution badly.
This change (available by setting I915_RING_VALIDATE to 1) checks to make
sure the cached tail pointer matches the hardware tail pointer at each ring
buffer addition, calling BUG_ON when that's not true.
There are now 3 lists. Active is buffers currently in the ringbuffer.
Flushing is not in the ringbuffer, but needs a flush before unbinding.
Inactive is as before. This prevents object_free → unbind →
wait_rendering → object_reference and a kernel oops about weird refcounting.
This also avoids an synchronous extra flush and wait when freeing a buffer
which had a write_domain set (such as a temporary rendered to and then from
using the 2d engine). It will sit around on the flushing list until the
appropriate flush gets emitted, or we need the GTT space for another
operation.
This lets us get some qualities we desire, such as using the full 32-bit
range (except zero), avoiding DRM_WAIT_ON, and a 1:1 mapping of active
sequence numbers to request structs, which will be used soon for throttling
and interrupt-driven list cleanup.
Additionally, a boolean active field is added to indicate which list an
object is on, rather than smashing last_rendering_cookie to 0 to show
inactive. This will help with flush-reduction later on, and makes the code
clearer.
It would be nice if one day the DRM driver was the canonical source for
register definitions and core macros. To that end, this patch cleans
things up quite a bit, removing redundant definitions (some with
different names referring to the same register) and generally tidying up
the header file.
In order to avoid recursive ->detect->interrupt->detect->interrupt->...
we need to disable TV hotplug interrupts in
intel_tv.c:intel_tv_detect_type. We also need to enable the TV interrupt
detection and hotplug sequence properly in i915_irq.c.
drm_crtc->fb may point to NULL, f.e X server will allocate a new fb
and assign it to the CRTC at startup, when X server exits, it will destroy
the allocated fb, making drm_crtc->fb points to NULL.