The DRM_XGI_PCIE_ALLOC and DRM_XGI_FB_ALLOC ioctls (and the matching
free ioctls) are unified to DRM_XGI_ALLOC. The desired memory region
is selected by xgi_mem_alloc::location. The region is magically
encoded in xgi_mem_alloc::index, which is used to release the memory.
Bump to version 0.11.0. This update requires a new DDX.
Need to find another way of doing this, ideally someone'd hunt down which
object/method controls it! The Xv blit adaptor is likely now broken on
cards that have pNv->WaitVSyncPossible enabled.
1. DRM_NOUVEAU_GPUOBJ_FREE
Used to free GPU objects. The obvious usage case is for Gr objects,
but notifiers can also be destroyed in the same way.
GPU objects gain a destructor method and private data fields with
this change, so other specialised cases (like notifiers) can be
implemented on top of gpuobjs.
2. DRM_NOUVEAU_CHANNEL_FREE
3. DRM_NOUVEAU_CARD_INIT
Ideally we'd do init during module load, but this isn't currently
possible. Doing init during firstopen() is bad as X has a love of
opening/closing the DRM many times during startup. Once the
modesetting-101 branch is merged this can go away.
IRQs are enabled in nouveau_card_init() now, rather than having the
X server call drmCtlInstHandler(). We'll need this for when we give
the kernel module its own channel.
4. DRM_NOUVEAU_GETPARAM
Add CHIPSET_ID value, which will return the chipset id derived
from NV_PMC_BOOT_0.
4. Use list_* in a few places, rather than home-brewed stuff.
The core DRM lastclose routine automatically destroys all mappings and
releases SG memory. XP10 DRM and DDX assumed this data stayed around
until module unload. xgi_bootstrap was reworked to recreate all these
mappings. In addition, the drm_addmap for the GART backing store was
moved into the kernel. This causes a change to the ioctl protocol and
a version bump.
There were numerous unnecessary fields in xgi_cmd_info. The remaining
fields had pretty crummy names. Cut out the cruft, and rename the
rest. As a result, the unused parameter "triggerCounter" to
triggerHWCommandList can be removed.
The data is now in kernel space, copied in/out as appropriate according to the
This results in DRM_COPY_{TO,FROM}_USER going away, and error paths to deal
with those failures. This also means that XFree86 4.2.0 support for i810 DRM
is lost.
As a fallout, replace filp storage with file_priv storage for "unique
identifier of a client" all over the DRM. There is a 1:1 mapping, so this
should be a noop. This could be a minor performance improvement, as everything
on Linux dereferenced filp to get file_priv anyway, while only the mmap ioctls
went the other direction.
This was used to make all ioctl handlers return -errno on linux and errno on
*BSD. Instead, just return -errno in shared code, and flip sign on return from
shared code to *BSD code.
Following my nv28 kmmio dumps, nouveau_wait_for_idle() is modified to
read PTIMER and NV03_PMC_ENABLE. Also a timeout based on PTIMER value is
added, so wait_for_idle() cannot stall indefinitely (unless PTIMER is
halted). The timeout was selected as 1 giga-ticks, which for me is 1s.
NV03_PGRAPH_NSTATUS and NV03_PGRAPH_NSOURCE.
The prefix NV03 is chosen because nv10reg.h had no versioned prefix,
and the code using these registers does not check card_type.
The ioctlss XGI_ESC_DEVICE_INFO, XGI_ESC_MEM_COLLECT,
XGI_ESC_PCIE_CHECK, XGI_ESC_GET_SCREEN_INFO, XGI_ESC_PUT_SCREEN_INFO,
XGI_ESC_MMIO_INFO, and XGI_ESC_SAREA_INFO, are completely unnecessary.
The will be doubly useless when the driver is converted to the DRM
infrastructure.
Allows multiple references to a single object, needed to support PCI(E)GART
scatter-gather DMA objects which would quickly fill PRAMIN if each channel
had its own.
Handle per-channel private instmem areas. This is needed to support NV50,
but might be something we want to do on earlier chipsets at some point?
Everything that touches PRAMIN is a GPU object.
For various reasons, this ioctl was a bad idea.
At channel creation we now automatically create DMA objects covering
available VRAM and GART memory, where the client used to do this themselves.
However, there is still a need to be able to create DMA objects pointing at
specific areas of memory (ie. notifiers). Each channel is now allocated a
small amount of memory from which a client can suballocate things (such as
notifiers), and have a DMA object created which covers the suballocated area.
The NOTIFIER_ALLOC ioctl exposes this functionality.
NV04/NV10 load_context()/save_context() are stubs. I don't know enough about
how they work to implement them sanely. The "old" context_switch() code
remains hooked up, so it shouldn't break anything.
NV20 will probably break if load_context() works. No inital context values
are filled in, so when the first channel is created PGRAPH will probably end
up having its state zeroed. Some setup from nv20_graph_init() will probably
need to be moved to the per-channel context setup.
- use a timer for disabling vblank events to avoid enable/disable calls too
often
- make i915 work with pre-965 chips again (would like to structure this
better, but this hack works on my test system)
s/u64/drm_u64_t/ to allow userspace code using drm.h to compile.
Move 64 bit arg member to the beginning to avoid alignment issues with 32
bit userspace on 64 bit kernels.
It's possible that we disable vblank interrupts and clear the
corresponding flag in irq_enable_reg, but receive an interrupt at just
the wrong time, causing us to not ack it properly, nor report to the
core kernel that it was handled. Fix that case by always handling
vblank interrupts, even if the irq_enable_reg field is clear.
Fix range of frame counter registers.
Use DRM_ERR() instead of Linux specific error codes in shared code.
Remove duplicate register definitions and superfluous local variables.
- use correct refcount variable in get/put routines
- extract counter update from drm_vblank_get
- make signal handling callback per-crtc
- update interrupt handling logic, drivers should use drm_handle_vblank
- move wakeup and counter update logic to new drm_handle_vblank routine
- fixup usage of get/put in light of counter update extraction
- fix longstanding bug in signal code, update pending counter only
*after* we're sure we'll setup signal handling
Introduce tile members for future tiled buffer support.
Allow user-space to explicitly define a fence-class.
Remove the implicit fence-class mechanism.
64-bit wide buffer object flag member.
These require that the status page be referenced by a pointer in GTT, rather
than phsyical memory. So, we have the X Server allocate that memory and tell
us the address, instead.
i915_driver_irq_postinstall was forcing vblank interrupts to pipe A when
called with vblank interrupts disabled. This caused vblank interrupts to be
accidentally re-enabled when VT switching the X server. Instead, start the
driver with vblank interrupts enabled on pipe A to support older X servers,
but then leave control over the state to the X server if it is able to do so.
This just cleans up the xf86drm.c to what I want and drm.h,
I need to fix up the kernel internals to suit these changes now.
I've moved to using struct instead of typedefs for the bo and it doesn't look
that bad so I'll do the same thing for mm and fence..
There can still be other contexts that may use page flipping later on, so don't
just unilaterally 'clean it up', which could lead to the wrong page being
displayed, e.g. when running 3D apps with a GLX compositing manager such as
compiz using page flipping.
The IGPGART setup code was traced using mmio-trace on fglrx by myself
and Phillip Ezolt <phillipezolt@gmail.com> on dri-devel.
This code doesn't let the 3D driver work properly as the card has no
vertex shader support.
Thanks to Matthew Garrett + Ubuntu for providing me some hardware to do this
work on.
It's a good idea to keep these synchronized; even though the DRM doesn't use all
the defines, maintaining two different copies is prone to errors when the diff
gets bigger.
The PGRAPH init for the various cards will need cleaning up at some point,
a lot of the values written there are per-context state left over from the
all the hardcoding done in the ddx.
It's possible some cards get broken by this commit, let me know.
Tested on: NV5, NV18, NV28, NV35, NV40, NV4E
Note that just like the values written to R300_RB3D_DSTCACHE_CTLSTAT these
values are really unknown; ideally more reverse engineering should be done to
determine what these values mean and when they should be set.
graphics objects:
- No longer takes flags/dmaobj parameters, requires some major changes
to the ddx to setup the object through the FIFO. This change is
likely to cause breakages on some cards (tested on NV05,NV28,NV35,
NV40 and NV4E).
dma objects:
- now takes a "class" parameter, not really used yet but we may need
it at some point.
- parameters are checked, so clients can't randomly create DMA objects
pointing at whatever they feel like.
misc:
- Added FB_SIZE/AGP_SIZE getparams
- Read PFIFO_INTR in PFIFO irq handler, not PMC_INTR
- Dump PGRAPH trap info on PGRAPH_INTR_NOTIFY if NSOURCE isn't
NOTIFICATION_PENDING.
Add refcounting of user waiters to the DRM hardware lock, so that we can use the
DRM_LOCK_CONT flag more conservatively.
Also add a kernel waiter refcount that if nonzero transfers the lock for the kernel context,
when it is released. This is useful when waiting for idle and can be used
for very simple fence object driver implementations for the new memory manager.
It also resolves the AIGLX startup deadlock for the sis and the via drivers.
i810, i830 still require that the hardware lock is really taken so the deadlock remains
for those two. I'm not sure about ffb. Anyone familiar with that code?
The MI_WAIT_FOR_EVENT instruction does not support waiting for several events
at once, so this should fix the lockups with page flipping when both pipes are
enabled.
Always use dev_priv->sarea_priv->pf_current_page directly. This allows clients
to modify it as well while they hold the HW lock, e.g. in order to sync pages
between pipes.
The assumption is that synchronous flips are not isolated usually, and waiting
for all of them could result in stalling the pipeline for long periods of time.
Also use i915_emit_mi_flush() instead of an old-fashioned way to achieve the
same effect.
Unfortunately, emitting asynchronous flips during vertical blank results in
tearing. So we have to wait for the previous vertical blank and emit a
synchronous flip.
Leave it to the client to wait for the flip to complete when necessary,
but wait for a previous flip to complete before emitting another one. This
should help avoid unnecessary stalling of the ring due to pending flips.
Call i915_do_cleanup_pageflip() unconditionally in preclose.
Memory types are either fixed (on-card or pre-bound AGP) or not fixed
(dynamically bound) to an aperture. They also carry information about:
1) Whether they can be mapped cached.
2) Whether they are at all mappable.
3) Whether they need an ioremap to be accessible from kernel space.
In this way VRAM memory and, for example, pre-bound AGP appear
identical to the memory manager.
This also makes support for unmappable VRAM simple to implement.
Hook into nv20 pgraph switching functions (they're identical for nv3x).
Actually call nv30_pgraph_context_init so the ctx_table is allocated.
Thanks to Carlos Martin for the help.
This means the loop will wait up to ~10ms for ring buffer space to become
available, rather than just however long it takes to check the space 10000
times. This matches other drivers' behavior when waiting for ring buffer/fifo
space.
* Pulled in some registers from nv10reg.h. Needed for context switching.
* Filled in nv30 graphics context (based on nv40_graph.c).
* Figure out nv30 context table, set up on context creation. Allows the cards automatic switching to work.
The overflows could lead to the AGP aperture overlapping the framebuffer area
in the card's address space when the latter is located at the very end of the
32 bit address space, which would result in a freeze on X server startup,
probably because the card read commands from the framebuffer instead of from
AGP.
See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=392915 .
Now I can get 3D + working grctx switching on my NV40 without
the binary driver initialising the card first. However, this
change also breaks 3D on my C51 even *with* the binary driver's
help. So, it's likely that the weird voodoo is card-specific.
This is enough to get grctx switching going on my NV40 and C51 after
the binary driver has initialised the card first.
Bumping the drm patchlevel because the ddx needs some modifications to
have NV4x work at all with these changes.
With this change the GPU is responsible for doing the channel switch
itself. This is needed for the upcoming NV4x PGRAPH context work as
we don't yet know enough to manually swap PGRAPH contexts.
Replace r300_check_offset() with generic radeon_check_offset(), which doesn't
reject valid offsets when the framebuffer area is at the very end of the card's
32 bit address space. Make radeon_check_and_fixup_offset() use
radeon_check_offset() as well.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=7697 .
The current version didn't build on BSD, where the new functionality isn't used
yet anyway. Whoever changes that will hopefully be able to make the OSes share
this file as well.
Previously, if there were several buffer swaps scheduled for the same vertical
blank, all but the first blit emitted stood a chance of exhibiting tearing. In
order to avoid this, split the blits along slices of each output top to bottom.
- Do important card init in firstopen
- Give each channel it's own cmdbuf dma object
- Move RAMHT config state to the same place as RAMRO/RAMFC
- Make sure instance mem for objects is *after* RAM{FC,HT,RO}
On X init, PFIFO and PGRAPH are reset to defaults. This causes the GPU to
loose the configuration done by the drm. Perhaps a CARD_INIT ioctl a proper
solution to having this problem again in the future..
This will come in very handy for tiled buffers on intel hardware.
Also add some padding to interface structures to allow future binary backwards
compatible changes.
Ok, I lied before.. it was a fluke it worked and required magic to repeat it..
It actually helps to fill in RAMFC entries in the correct place.
The code also clears RAMIN entirely instead of just the hash-table.
Fix buffer bound caching policy changing, Allow
on-the-fly changing of caching policy on bound buffers if the hardware
supports it.
Allow drivers to use driver-specific AGP memory types for TTM AGP pages.
Will make AGP drivers much easier to migrate.
Adapt for new functions in the 2.6.19 kernel.
Remove the ability to have multiple regions in one TTM.
This simplifies a lot of code.
Remove the ability to access TTMs from user space.
We don't need it anymore without ttm regions.
Don't change caching policy for evicted buffers. Instead change it only
when the buffer is accessed by the CPU (on the first page fault).
This tremendously speeds up eviction rates.
Current code is safe for kernels <= 2.6.14.
Should also be OK with 2.6.19 and above.
Added preliminary support for context switches (triggers the interrupts, but hangs after the switch ; something's not quite right yet).
Removed the PFIFO_REINIT ioctl. I hope it's that a good idea...
Requires the upcoming commit to the DDX.
mach64_state.c: convert the DRM_MACH64_BLIT ioctl to submit a pointer to
user-space memory rather than a DMA buffer index, similar to DRM_MACH64_VERTEX.
This change allows the DDX to map the DMA buffers read-only and eliminate a
security problem where a client can alter the contents of the DMA buffer after
submission to the DRM.
This change also affects the DRI/DRM interface. Performace-wise, it basically
affects PCI mode where I get a ~12% speedup for some Mesa demos I tested.
This is mainly due to eliminating an ioctl for allocating the DMA buffer.
mach64_dma.c: move the responsibility for allocating memory for the DMA ring
in PCI mode to the DDX.
This change affects the DDX/DRM interface and unifies a couple of PCI/AGP code
paths for ring memory in the DRM.
Bump the mach64 DRM version major and date.
Map the DMA buffers from the same linear address as the vertex bufs. If
dev->agp_buffer_token is not set, mach64 drm maps the DMA buffers from
linear address 0x0.