In 7d84809fdc I added a fast path for the
case-sensitive case, but the lookup is still slowing down Compose parsing.
Instead of the binary search, use a perfect hash function, computed with
a simple Python module I found (vendored).
It is faster -- perf diff is:
Baseline  Delta Abs  Shared Object  Symbol
........  .........  .............  ................................
  22.35%    -14.04%  libc-2.33.so   [.] __strcmp_avx2
  16.75%    +10.28%  bench-compose  [.] xkb_keysym_from_name
  20.72%     +2.40%  bench-compose  [.] parse.constprop.0
   2.29%     -1.97%  bench-compose  [.] strcmp@plt
   2.56%     +1.81%  bench-compose  [.] resolve_name
   2.37%     +0.92%  libc-2.33.so   [.] __GI_____strtoull_l_internal
  26.19%     -0.63%  bench-compose  [.] lex
   1.45%     +0.56%  libc-2.33.so   [.] __memchr_avx2
   1.13%     -0.31%  libc-2.33.so   [.] __strcpy_avx2
Also reduces the binary size:
Before:
   text    data     bss     dec     hex filename
 341111    5064       8  346183   54847 build/libxkbcommon.so.0.0.0
After:
   text    data     bss     dec     hex filename
 330215    5064       8  335287   51db7 build/libxkbcommon.so.0.0.0
Note however that it's still larger than before 7d84809fdccbb5898d08388:
   text    data     bss     dec     hex filename
 320617    5168       8  325793   4f8a1 build/libxkbcommon.so.0.0.0
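For illustration only, a minimal sketch of what a perfect-hash lookup looks like next to a binary search: one hash computation, then a single strcmp() to confirm the candidate. The four-entry table, the toy_* names and the first-byte hash are hypothetical and hand-picked so that no two names collide (ASCII assumed). The real table is generated by the vendored module and is not reproduced here.

```c
#include <stdint.h>
#include <string.h>

struct name_keysym {
    const char *name;
    uint32_t keysym;
};

/* Slot order is dictated by the hash below, not by sort order. */
static const struct name_keysym toy_table[4] = {
    { "Tab",    0xff09 },  /* 'T' =  84,  84 % 4 == 0 */
    { "Escape", 0xff1b },  /* 'E' =  69,  69 % 4 == 1 */
    { "Return", 0xff0d },  /* 'R' =  82,  82 % 4 == 2 */
    { "space",  0x0020 },  /* 's' = 115, 115 % 4 == 3 */
};

/* Hypothetical hash: collision-free for the four names above only. */
static unsigned toy_hash(const char *name)
{
    return (unsigned char) name[0] % 4u;
}

/* Returns the keysym, or 0 (no symbol) if the name is not in the table. */
static uint32_t toy_keysym_from_name(const char *name)
{
    const struct name_keysym *entry = &toy_table[toy_hash(name)];

    /* The hash only selects a candidate slot; one strcmp() confirms it. */
    return strcmp(entry->name, name) == 0 ? entry->keysym : 0;
}
```

A name outside the table still hashes to some slot, so the final strcmp() is what rejects it.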
Signed-off-by: Ran Benita <ran@unusedvar.com>
Previously we used a simple trie with a linked list for each chain.
Unfortunately most compose files have very long chains, which means the
construction performs an almost quadratic number of comparisons.
Switch to using a ternary search tree instead. This is very similar to a
trie, only the linked list is essentially replaced with a binary tree.
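As a rough sketch of the data structure (not the packed layout used in the parser), a pointer-based ternary search tree over keysym sequences could look like the following. All names here (tst_node, tst_insert, tst_lookup, tst_free) are hypothetical; lo and hi form the binary tree that replaces the sibling linked list, and eq descends to the next position in the sequence.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct tst_node {
    uint32_t key;           /* keysym at this position of the sequence */
    struct tst_node *lo;    /* siblings with a smaller key */
    struct tst_node *hi;    /* siblings with a larger key */
    struct tst_node *eq;    /* next position in the sequence */
    uint32_t result;        /* non-zero if a sequence ends at this node */
};

/* Inserts seq[0..len) and returns the (possibly new) subtree root.
 * If an allocation fails, the sequence is simply left uninserted. */
static struct tst_node *
tst_insert(struct tst_node *node, const uint32_t *seq, size_t len,
           uint32_t result)
{
    if (len == 0)
        return node;
    if (!node) {
        node = calloc(1, sizeof(*node));
        if (!node)
            return NULL;
        node->key = seq[0];
    }
    if (seq[0] < node->key)
        node->lo = tst_insert(node->lo, seq, len, result);
    else if (seq[0] > node->key)
        node->hi = tst_insert(node->hi, seq, len, result);
    else if (len > 1)
        node->eq = tst_insert(node->eq, seq + 1, len - 1, result);
    else
        node->result = result;
    return node;
}

/* Returns the stored result, or 0 if the exact sequence is absent. */
static uint32_t
tst_lookup(const struct tst_node *node, const uint32_t *seq, size_t len)
{
    while (node && len > 0) {
        if (seq[0] < node->key) {
            node = node->lo;
        } else if (seq[0] > node->key) {
            node = node->hi;
        } else if (len == 1) {
            return node->result;
        } else {
            node = node->eq;
            seq++;
            len--;
        }
    }
    return 0;
}

static void tst_free(struct tst_node *node)
{
    if (!node)
        return;
    tst_free(node->lo);
    tst_free(node->hi);
    tst_free(node->eq);
    free(node);
}
```

With a linked list, finding a sibling costs a comparison per earlier sibling; with lo/hi links it costs one comparison per tree level, which is where the saving on long chains would come from (assuming the tree stays reasonably balanced).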
On the en_US/Compose file, the perf diff is the following (the modified
function is `parse`):
Event 'cycles:u'
Baseline  Delta Abs  Shared Object  Symbol
........  .........  .............  ................................
  39.91%    -17.62%  bench-compose  [.] parse.constprop.0
  20.54%     +6.47%  bench-compose  [.] lex
  17.28%     +5.55%  libc-2.33.so   [.] __strcmp_avx2
  12.78%     +4.01%  bench-compose  [.] xkb_keysym_from_name
   2.30%     +0.83%  libc-2.33.so   [.] __GI_____strtoull_l_internal
   3.36%     +0.78%  bench-compose  [.] strcmp@plt
Thanks to some careful packing, the memory usage is pretty much the
same.
Signed-off-by: Ran Benita <ran@unusedvar.com>
I'd really like to use anonymous unions/structs. Supposedly even MSVC
supports it now. Let's try and see.
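For reference, a small sketch of what anonymous unions/structs (standard since C11) allow. The toy_node layout and toy_leaf() helper are hypothetical and only meant to show the syntax: members of the unnamed union and structs are accessed directly on the enclosing struct.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_node {
    uint32_t keysym;
    bool is_leaf;
    union {                 /* anonymous union */
        struct {            /* anonymous struct: internal node */
            uint16_t eqkid;
        };
        struct {            /* anonymous struct: leaf node */
            uint32_t result;
        };
    };
};

static struct toy_node toy_leaf(uint32_t keysym, uint32_t result)
{
    struct toy_node node = { .keysym = keysym, .is_leaf = true };

    /* No intermediate member name is needed to reach 'result'. */
    node.result = result;
    return node;
}
```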
Signed-off-by: Ran Benita <ran@unusedvar.com>
Relatedly, strtoul allows a lot of unwanted stuff (spaces, +/- sign,
thousand separators), we really ought not to use it. But that's for another
time.
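As a sketch of what a stricter replacement might look like (not something this commit adds), a hand-rolled hexadecimal parser can accept digits only. The name parse_hex_strict and the 8-digit limit are assumptions made for the example. For comparison, strtoul(" +10", NULL, 16) happily returns 16.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Accepts 1 to 8 hexadecimal digits and nothing else: no whitespace,
 * no sign, no "0x" prefix, no trailing characters. */
static bool parse_hex_strict(const char *s, uint32_t *out)
{
    uint32_t value = 0;
    size_t digits = 0;

    for (; *s != '\0'; s++, digits++) {
        unsigned digit;

        if (*s >= '0' && *s <= '9')
            digit = (unsigned) (*s - '0');
        else if (*s >= 'a' && *s <= 'f')
            digit = (unsigned) (*s - 'a') + 10;
        else if (*s >= 'A' && *s <= 'F')
            digit = (unsigned) (*s - 'A') + 10;
        else
            return false;

        /* A ninth digit cannot fit in 32 bits. */
        if (digits == 8)
            return false;

        value = (value << 4) | digit;
    }

    if (digits == 0)
        return false;

    *out = value;
    return true;
}
```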
Signed-off-by: Ran Benita <ran@unusedvar.com>
Fits in uint16_t, which enables some future optimizations. But also a
good idea to have some limit. Not aware of any compose files which come
close.
Signed-off-by: Ran Benita <ran@unusedvar.com>
../../../src/xkbcomp/compat.c:693:16: warning: Although the value stored to
'merge' is used in the enclosing expression, the value is never actually read
from 'merge' [deadcode.DeadStores]
si.merge = merge = (def->merge == MERGE_DEFAULT ? merge : def->merge);
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
The target buffer is 7 bytes long, null-termination is optional (as the comment
already suggests). Coverity is unhappy about this though so let's use memset and
memcpy instead.
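A minimal sketch of the memset + memcpy pattern for a fixed-size field whose terminator is optional. TOY_NAME_LEN and toy_set_name() are hypothetical names; only the 7-byte size comes from the description above.

```c
#include <string.h>

#define TOY_NAME_LEN 7

/* Fills dst with at most TOY_NAME_LEN bytes of src, zero-padded.
 * dst is NOT null-terminated when src has TOY_NAME_LEN or more bytes. */
static void toy_set_name(char dst[TOY_NAME_LEN], const char *src)
{
    size_t len = strlen(src);

    if (len > TOY_NAME_LEN)
        len = TOY_NAME_LEN;

    memset(dst, 0, TOY_NAME_LEN);
    memcpy(dst, src, len);
}
```

The explicit length clamp makes the truncation visible to a static analyzer, rather than leaving it implicit in the copy call.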
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
xkb_keysym_from_name() is called a lot in Compose file parsing. The
lower case handling slows things down a lot (particularly given we can't
use the optimized strcasecmp() due to locale issues). So add separate
handling for the case-sensitive case, which is the one used by Compose.
To do this we need to add another version of the ks_tables table. This
adds ~20kb to the shared library binary. We can probably do something
better here but I think it's fine.
Signed-off-by: Ran Benita <ran@unusedvar.com>
Turns out FreeBSD supports evdev, so this tool can work on it; however
it does not support epoll, so switch to poll, which is portable.
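A minimal sketch of waiting on a few file descriptors with poll(2), roughly the shape an epoll loop takes after such a conversion. The helper toy_wait_readable() and its limit of 8 descriptors are assumptions for the example, not the tool's actual code.

```c
#include <poll.h>

#define TOY_MAX_FDS 8

/* Waits up to timeout_ms for any of fds[0..nfds) to become readable.
 * Returns the index of the first readable descriptor, or -1 on timeout,
 * error (including EINTR) or an out-of-range nfds. */
static int toy_wait_readable(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfds[TOY_MAX_FDS];

    if (nfds < 0 || nfds > TOY_MAX_FDS)
        return -1;

    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    if (poll(pfds, (nfds_t) nfds, timeout_ms) <= 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN)
            return i;
    }
    return -1;
}
```

Unlike epoll, the descriptor set is passed in on every call, which is fine for a handful of input devices.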
Reported-by: Evgeniy Khramtsov <evgeniy@khramtsov.org>
Signed-off-by: Ran Benita <ran@unusedvar.com>
src/x11/keymap.c:980:26: warning: comparison of integer expressions of different signedness: ‘size_t’ {aka ‘long unsigned int’} and ‘int’ [-Wsign-compare]
  980 |     for (size_t i = 0; i < length; i++) {
      |                          ^
Signed-off-by: Ran Benita <ran@unusedvar.com>
libxkbcommon-1.0.3/src/xkbcomp/ast-build.c:526: leaked_storage: Variable "file"
going out of scope leaks the storage it points to.
Where we exit the loop early, we don't release the various allocated memory.
Make this patch more obvious by moving the declarations for those into the loop
as well; this way we know that they aren't used outside the loop anywhere.
Found by Coverity.
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
If the name is missing in a configItem, we'd fail and leak the memory for
description, brief and vendor.
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
False positive because we rely on xkb_components_from_rules() to initialize this
struct, but let's localize the variable anyway to shut Coverity up.
libxkbcommon-1.0.3/bench/rules.c:59:9: warning[-Wanalyzer-double-free]:
double-free of kccgst.symbols
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Both get_atom_name() and the new atom interner required a round trip. Move
get_atom_name() into the atom interner to save one more round trip. This brings
xkb_x11_keymap_new_from_device() down to two round trips, which is the minimum
possible number.
(Also, I think the new code in keymap.c is more readable than the mess I
previously created)
With this last commit in the series, this definitely:
Fixes: https://github.com/xkbcommon/libxkbcommon/pull/217
Signed-off-by: Uli Schlachter <psychon@znc.in>
There are a number of XKB requests needed to request all the information from
the X11 server. So far, the code was sending one request and waiting for the
reply. This commit starts batching the requests so that we get multiple replies
with one round trip.
This removes three round trips.
Only the simple requests are converted. get_map() and get_names() use some
bitmasks that are needed for both the request and the reply. These will be dealt
with separately.
Signed-off-by: Uli Schlachter <psychon@znc.in>
Instead of asking for an atom name and waiting for the reply four times, this
now sends four GetAtomName requests and waits for all the replies at once. Thus,
this saves three round trips.
Signed-off-by: Uli Schlachter <psychon@znc.in>
On my system, calling xkb_x11_keymap_new_from_device() did 78 round trips to the
X11 server, which seems excessive. This commit brings this number down to about
9 to 10 round trips.
The existing functions adopt_atom() and adopt_atoms() guarantee that the atom
was adopted by the time they return. Thus, each call to these functions must do
a round-trip. However, none of the callers need this guarantee.
This commit makes "atom adopting" asynchronous: Only some time later is the atom
actually adopted. Until then, it is in some pending "limbo" state.
This actually fixes a TODO in the comments.
Fixes: https://github.com/xkbcommon/libxkbcommon/issues/216
Signed-off-by: Uli Schlachter <psychon@znc.in>
A simple script that creates a new layout with the given keysym replacing TLDE.
Then we compile a keymap and search for the keysym being assigned to TLDE and
bail if that fails.
The list of keysyms is manually maintained but we only need to add one or two to
spot-check whenever the xorgproto is updated.
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
As of xorgproto commit 5dbb5b76597f [1], the 0x10081XXX keysym range is defined
for direct evdev kernel keycode mapping. For example, KEY_MACRO1 (0x290) is
mapped to 0x10081290. The format of the #define lines for these keys is
stable to allow for parsing:
#define XF86XK_FooBar _EVDEVK(0x123) /* optional comment */
Update our script so we detect these new lines. Our keysym generation is a
two-step process: makeheader and then makekeys. Replacing the key with its full
value in the makeheader script means we don't have to update makekeys to handle
the _EVDEVK macro and our header file is fully resolved.
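The arithmetic behind the mapping can be sketched in C as follows. TOY_EVDEVK and toy_evdev_keysym() are hypothetical stand-ins written to match the KEY_MACRO1 example above; they are not copied from the xorgproto header.

```c
#include <stdint.h>

/* Hypothetical equivalent of the _EVDEVK() macro: the evdev key code is
 * added to the base of the 0x10081XXX range. */
#define TOY_EVDEVK(v) (UINT32_C(0x10081000) + (v))

/* Returns the fully resolved keysym value for an evdev key code. */
static uint32_t toy_evdev_keysym(uint32_t evdev_code)
{
    return TOY_EVDEVK(evdev_code);
}
```

Resolving the value up front is what lets the generated header carry plain numbers such as 0x10081290 rather than macro invocations.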
[1] https://gitlab.freedesktop.org/xorg/proto/xorgproto/-/merge_requests/23
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
rmlvos is the parent list which then fails during a list join because, well,
it's a list of lists.
Fixes #206
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Makes this test easier to run from the command line. Where either of top_srcdir
or top_builddir isn't set, fill them in from the CWD or fail otherwise.
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>