This allows apps to destroy the window and renderer in either order, but
makes sure that the renderer can properly clean up its resources while OpenGL
contexts and libraries are still loaded, etc.
If the window is destroyed first, the renderer is (mostly) destroyed but its
pointer remains valid. Attempts to use the renderer will return an error,
but it can still be explicitly destroyed, at which time the struct is freed.
If the renderer is destroyed first, everything works as before, and a new
renderer can still be created on the existing window.
Fixes #9540.
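The ownership rules above can be sketched in a few lines of plain C. This is only an illustrative model, not the actual implementation: the `Window` and `Renderer` structs, the `destroyed` flag, and every function name below are hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real window and renderer structs. */
typedef struct Renderer Renderer;

typedef struct Window {
    Renderer *renderer;   /* renderer attached to this window, or NULL */
} Window;

struct Renderer {
    Window *window;       /* owning window, NULL once detached */
    bool destroyed;       /* true once backend resources are released */
    int *backend_data;    /* stands in for GL contexts, textures, etc. */
};

Renderer *create_renderer(Window *w)
{
    assert(w != NULL && w->renderer == NULL);
    Renderer *r = calloc(1, sizeof(*r));
    if (!r) {
        return NULL;
    }
    r->backend_data = malloc(sizeof(int));
    if (!r->backend_data) {
        free(r);
        return NULL;
    }
    r->window = w;
    w->renderer = r;
    return r;
}

/* Release backend resources and detach from the window, but keep the
 * struct itself alive so the caller's pointer stays valid. */
static void release_renderer_resources(Renderer *r)
{
    if (r->destroyed) {
        return;
    }
    free(r->backend_data);
    r->backend_data = NULL;
    if (r->window) {
        r->window->renderer = NULL;
        r->window = NULL;
    }
    r->destroyed = true;
}

/* Any renderer operation: fails cleanly once the renderer is destroyed. */
int render_clear(Renderer *r)
{
    if (!r || r->destroyed) {
        return -1;
    }
    return 0;
}

/* Destroying the window first tears down the renderer's resources while
 * they can still be cleaned up, but does not free the struct. */
void destroy_window(Window *w)
{
    if (w->renderer) {
        release_renderer_resources(w->renderer);
    }
    /* Window teardown would go here. */
}

/* Explicit destruction releases whatever is left and frees the struct. */
void destroy_renderer(Renderer *r)
{
    if (!r) {
        return;
    }
    release_renderer_resources(r);
    free(r);
}
```

In this model, calling destroy_window() and then destroy_renderer() is safe, and calling destroy_renderer() first leaves the window free to receive a new renderer.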
Previously, each backend would allocate and free the renderer struct. Now
the higher level does it, so the backends only manage their private resources.
This removes some boilerplate and avoids some potential accidents.
Headless display servers might not send an initial configure event, so don't assume that one has arrived, and avoid sending garbage size/position values when showing a window.
Otherwise, it isn't immediately clear to the client application why its windows suddenly disappeared and why it received a quit event out of nowhere.
When running in a container, the underlying PipeWire version may not match the library version, so retrieve and check the core version info to see if it meets the preferred version requirements.
Checking for the pipewire-pulse service is unreliable when used in containers such as Flatpak, so simply use a minimum version check instead and prefer it over the PulseAudio backend if the version is at least 1.0.0.
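A minimum version check of this kind could look roughly like the sketch below. It assumes the server reports its version as a "major.minor.patch" string; the helper names parse_version() and version_at_least() are hypothetical and are not taken from the actual backend code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Parse a "major.minor.patch" string. Returns false if the string is
 * missing or does not start with three dot-separated integers. */
bool parse_version(const char *s, int *major, int *minor, int *patch)
{
    assert(major != NULL && minor != NULL && patch != NULL);
    if (!s) {
        return false;
    }
    return sscanf(s, "%d.%d.%d", major, minor, patch) == 3;
}

/* Return true if the reported version is at least the required one.
 * An unparsable version is treated as not meeting the requirement. */
bool version_at_least(const char *reported,
                      int req_major, int req_minor, int req_patch)
{
    int major, minor, patch;

    if (!parse_version(reported, &major, &minor, &patch)) {
        return false;
    }
    if (major != req_major) {
        return major > req_major;
    }
    if (minor != req_minor) {
        return minor > req_minor;
    }
    return patch >= req_patch;
}
```

With helpers like these, the backend preference reduces to a single call such as version_at_least(reported, 1, 0, 0), where `reported` is the version string obtained from the server at runtime rather than the version the library was built against.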
XWayland emulates the XRandR interface, so it still needs the actual mode switch call to trigger the mode switching emulation.
There is also no need to wait when using XWayland mode switching emulation, as it is handled via viewport scaling and thus instantaneous.
Otherwise, when you call SDL_CreateCondition() in something that can otherwise
survive in a single-threaded build, you'll get an error that seems fatal.
We already do this for mutexes and rwlocks (but not semaphores!)
Fixes #9558.
This allows using a much smaller (1.5 KB) lookup table, in exchange for a small amount of extra work per frame.
The extra work (a few extra loads, multiplies, and adds) is negligible, and can execute in parallel.
The reduction in cache misses almost certainly outweighs any added cost.
The table is generated at runtime, and takes less than 0.02 ms on my computer.
Compiler support for loading/storing multiple registers at once (e.g. vld1q_f32_x4) seems very poor, so avoid them for now.
Also switched to aligned stores with SSE. Although both SSE and NEON support unaligned stores, they are more likely to incur a penalty, e.g. when crossing a cache line, so we might as well align them.
This clears up confusion about whether to use SDL_KeyCode or SDL_Keycode and makes it clear that the values aren't the full set of possible keycodes.
Fixes https://github.com/libsdl-org/SDL/issues/9493