1417 Commits (5270a5532470564168affb04d5749de65efc3d05)

Author SHA1 Message Date
Mikhail Gusarov d21645be08 xkbcomp: Require newline after !include line in rules files
Rules file parser allows constructs like

!include "foo" !include "bar" !layout = symbols

This is most likely an oversight in original code.

Closes #452
2024-02-20 11:27:02 +10:00
Pierre Le Marre ed2dc978b1 keysyms: Update using latest xorgproto
This fixes a typo and improves comments.

xorgproto commit: cd33097fc779f280925c6d6bbfbd5150f93ca5bc
Relevant MR: https://gitlab.freedesktop.org/xorg/proto/xorgproto/-/merge_requests/84
2024-02-08 14:24:28 +01:00
Pierre Le Marre 1731c6b3ef compose: Ensure we mmap only regular files
Currently we do not check that the Compose files we try successively are
*regular* files. This may result in an error, whereas we should really
just skip the corresponding path.

Fixed by adding the new utility function `open_file`.
2024-02-07 09:24:59 +01:00
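
The check described in the commit above can be sketched as follows. This is a minimal POSIX illustration, not the actual `open_file` from the commit: the name `open_regular_file` and its exact behaviour are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not the commit's code): open `path` for reading,
 * but only if it refers to a regular file. Returns NULL for missing
 * paths, directories and other non-regular files, so a caller can simply
 * move on to the next candidate path. */
static FILE *
open_regular_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return NULL;
    }

    /* Hand the descriptor over to stdio; close it ourselves on failure. */
    FILE *file = fdopen(fd, "rb");
    if (!file)
        close(fd);
    return file;
}
```

Checking with `fstat` on the already opened descriptor, rather than `stat` on the path, avoids a race between the check and the open.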
Pierre Le Marre 382f6d2d5e Keysyms: Update using latest xorgproto
For the sake of compatibility, this reintroduces some deleted keysyms and
postpones the effective deprecation of others.

xorgproto commit: fe12c5102762afcbf852e50dcbbdea2ef625570c

Also added tests for some canonical names.
2024-02-06 17:45:02 +01:00
Yuichiro Hanada efdb05d193 parser: Do not allow the empty symbol declaration
An empty element is allowed in SymbolsBody definition, so the following
keymap is grammatically correct.

```
xkb_keymap {
  ...
  xkb_symbols "sym" {
    key <SPC> {, [Space] };
  };
};
```

However, the current parser crashes with the keymap due to null pointer
access.
This change fixes it by changing the parser not to allow it.
2024-02-05 09:42:02 +01:00
Pierre Le Marre 43c9752d44 compose: Fix iterator for empty tables
The current `xkb_compose_table_iterator_next` segfaults when used with an
empty table. Indeed, in this case we initialize cursors in
`xkb_compose_table_iterator_new` with the dummy node and the direction
`NODE_LEFT`, but the dummy node is a leaf!

Fixed by initializing with no cursors when the table has no non-dummy
nodes.
2024-01-16 11:34:28 +01:00
Pierre Le Marre 1034f272e4 compose: Reduce xkb_compose_state_get_utf8 buffer
`xkb_keysym_to_utf8` does not return a keysym name but its corresponding
character, if defined.
2024-01-12 10:58:38 +01:00
Pierre Le Marre 8cca3a7bfb compose: Add XKB_COMPOSE_MAX_STRING_SIZE
Define the maximum size of a compose sequence result string explicitly as
a constant and use it everywhere to improve the code readability.
2024-01-12 10:58:38 +01:00
Pierre Le Marre 5c7c79970a keysyms: Fix case mapping for 3 Latin 1 keysyms
Fixed the upper case mappings for:
- XKB_KEY_ydiaeresis
- XKB_KEY_mu
- XKB_KEY_ssharp

Note: In Unicode, the upper case of “ß” (U+00DF) is *not* “ẞ” (U+1E9E)
but “SS”. “ẞ” is reserved for text in capitals.
2024-01-10 15:03:22 +01:00
Pierre Le Marre b5a140832b keysyms: Fix xkb_keysym_is_modifier
Currently `xkb_keysym_is_modifier` does not detect the following keysyms:
- `XKB_KEY_ISO_Level5_Shift`
- `XKB_KEY_ISO_Level5_Latch`
- `XKB_KEY_ISO_Level5_Lock`

Indeed, there is a mistake in the keysym interval that the code checks.
The reason seems to be the confusing order of the keysyms in
`xkbcommon-keysyms.h`: the current code has a comment “libX11 only goes
up to XKB_KEY_ISO_Level5_Lock”, but in fact the modifiers keysyms are
listed in a _semantic_ order in `xkbcommon-keysyms.h`, not in the
increasing keysym _value_ order.

Fixed by using the same (correct) code as libX11 and added some tests.
2024-01-10 14:43:29 +01:00
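
The interval check borrowed from libX11 (its `IsModifierKey` macro) can be sketched as below. The keysym values are copied in as local constants so the example is self-contained; the `KEY_*` names and the function name are illustrative, not the library's own identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Keysym values, copied here so the sketch needs no xkbcommon headers. */
#define KEY_ISO_Lock         0xfe01
#define KEY_ISO_Level5_Shift 0xfe11
#define KEY_ISO_Level5_Latch 0xfe12
#define KEY_ISO_Level5_Lock  0xfe13
#define KEY_Mode_switch      0xff7e
#define KEY_Num_Lock         0xff7f
#define KEY_Shift_L          0xffe1
#define KEY_Hyper_R          0xffee

/* Same intervals as libX11's IsModifierKey: the ISO range must end at
 * ISO_Level5_Lock, the largest value of that group, for the three
 * Level5 keysyms to be detected. */
static bool
keysym_is_modifier(uint32_t ks)
{
    return (ks >= KEY_Shift_L && ks <= KEY_Hyper_R) ||
           (ks >= KEY_ISO_Lock && ks <= KEY_ISO_Level5_Lock) ||
           ks == KEY_Mode_switch ||
           ks == KEY_Num_Lock;
}
```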
Pierre Le Marre 0074baf451 keysyms: Add XKB_KEYSYM_NAME_MAX_SIZE for internal use
Currently there is no indication of the maximum length of keysym names.

This is statically known, so add the following new *internal* API:
`XKB_KEYSYM_NAME_MAX_SIZE`.
2024-01-10 14:18:43 +01:00
Pierre Le Marre 817179d866 keysyms: Add xkb_keysym_iterator
Add an efficient way to iterate over the assigned keysyms.

Currently only provided for testing, so we guard it by
`ENABLE_PRIVATE_APIS` in order to reduce the installed library.
2024-01-10 14:18:43 +01:00
Pierre Le Marre 4f52d6068e keysyms: Add xkb_keysym_is_assigned
Add internal API `xkb_keysym_is_assigned` for tests, guarded by
`ENABLE_PRIVATE_APIS` in order to avoid increasing the size of the installed library.
2024-01-10 14:18:43 +01:00
Pierre Le Marre 82f138c633 keysyms: Add min and max assigned keysyms internal API
Currently there is no direct way to know the minimum and maximum keysym
values that are assigned, i.e. that have an explicit name or are
Unicode keysyms.

Introduce the following new internal API:
- XKB_KEYSYM_MIN_ASSIGNED
- XKB_KEYSYM_MAX_ASSIGNED
- XKB_KEYSYM_MIN_EXPLICIT
- XKB_KEYSYM_MAX_EXPLICIT
- XKB_KEYSYM_COUNT_EXPLICIT

Also add a bunch of tests to ensure consistent keysyms bounds.
2024-01-10 14:18:43 +01:00
Pierre Le Marre 4a92f61b5c keysyms: Add Unicode constants
Add the following constants in order to improve the code readability:
- XKB_KEYSYM_UNICODE_OFFSET
- XKB_KEYSYM_UNICODE_MIN
- XKB_KEYSYM_UNICODE_MAX
2024-01-10 14:18:43 +01:00
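
To illustrate how such constants read in code, here is a small sketch of decoding a Unicode keysym. The values are assumptions taken from the X11 keysym encoding rules (Unicode keysyms are the code point plus 0x01000000, covering U+0100 to U+10FFFF); the names are local to this example and deliberately differ from the internal ones.

```c
#include <stdint.h>

/* Assumed values, following the X11 keysym encoding. */
#define KEYSYM_UNICODE_OFFSET 0x01000000u
#define KEYSYM_UNICODE_MIN    0x01000100u
#define KEYSYM_UNICODE_MAX    0x0110ffffu

/* Return the code point encoded by a Unicode keysym, or 0 if `ks` lies
 * outside the Unicode keysym range. */
static uint32_t
unicode_keysym_to_codepoint(uint32_t ks)
{
    if (ks < KEYSYM_UNICODE_MIN || ks > KEYSYM_UNICODE_MAX)
        return 0;
    return ks - KEYSYM_UNICODE_OFFSET;
}
```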
Pierre Le Marre dd65950f44 Provide ssize_t for Windows
`ssize_t` is not part of standard C but is specific to POSIX.
2023-12-19 07:45:26 +01:00
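
One common way to provide `ssize_t` on Windows is sketched below; whether the commit above uses exactly this approach is an assumption. `checked_length` is a hypothetical helper that only shows the usual "length or -1" use of the type.

```c
#include <stddef.h>
#include <string.h>

#ifdef _MSC_VER
/* MSVC has no ssize_t; SSIZE_T from the Windows SDK is the signed
 * counterpart of size_t. */
#include <basetsd.h>
typedef SSIZE_T ssize_t;
#else
/* POSIX systems declare ssize_t here. */
#include <sys/types.h>
#endif

/* Hypothetical helper: return the length of `s`, or -1 if `s` is NULL. */
static ssize_t
checked_length(const char *s)
{
    if (s == NULL)
        return -1;
    return (ssize_t) strlen(s);
}
```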
Pierre Le Marre 7185b023c5 Add documentation to XkbToControl
While some transformations are defined in the section “Interpreting the
Control Modifier” of the XKB protocol, a lot are not. Documentation is
scarce about the topic, so write down some context using the following
sources:

- https://en.wikipedia.org/wiki/Caret_notation#Use_in_software
- https://www.vt100.net/shuford/terminal/dec_keyboards_news.txt
- https://misc.openbsd.narkive.com/NvSWf6ax/which-key-shortcuts-are-safe-to-bind-and-some-q-s-about-history-and-os-diffs-re-ctrl-4-means
- https://vt100.net/docs/vt220-rm/chapter3.html#T3-5
2023-12-19 07:01:04 +01:00
Pierre Le Marre 20329baf4d xkbcomp: Use `steal` for better memory handling 2023-12-04 10:44:52 +10:00
Pierre Le Marre 20c6fa62a7 registry: Use `steal` for better memory handling 2023-12-04 10:44:52 +10:00
Pierre Le Marre dbc187cfa1 utils: Steal `steal` from libei
Add excerpt of `util-mem.h` from libei defining the macro `steal`, in
order to improve memory management and the code semantics.

See: 38132d6fc5/src/util-mem.h (L92)
2023-12-04 10:44:52 +10:00
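
The idea behind a `steal`-style macro can be sketched as follows. This is an illustration written for this page, not the libei excerpt itself; it relies on GNU C statement expressions and `__typeof__`, so it assumes GCC or Clang. `struct holder` and `take_name` are hypothetical.

```c
#include <stdlib.h>

/* Read the pointer stored at `ptr_`, reset the source to NULL and yield
 * the old value, so ownership visibly moves to the caller. */
#define steal(ptr_) __extension__ ({              \
    __typeof__(*(ptr_)) stolen_ = *(ptr_);        \
    *(ptr_) = NULL;                               \
    stolen_;                                      \
})

struct holder {
    char *name;
};

/* Take ownership of `h->name`; a later cleanup of `h` cannot free it
 * twice because the field is now NULL. */
static char *
take_name(struct holder *h)
{
    return steal(&h->name);
}
```

Compared with a manual copy followed by a separate `= NULL` assignment, the macro makes the transfer a single expression, which is harder to get half right.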
Pierre Le Marre 7caf57f013 registry: Parse “popularity” attribute
Previously the attribute “popularity” was completely ignored. It also
did not respect the modified DTD, because its default value depends on
whether we are currently parsing an “extras” rules file.

Fixed:
- Always parse the popularity attribute.
- Change the DTD to reflect that the default value is implied.
2023-11-21 08:10:22 +01:00
Pierre Le Marre 0a577a0998 xkbcli-compile-compose: Fix string result escaping
Currently the result string is not escaped and may produce invalid
results.

Fixed by introducing an ad-hoc escape function and related tests.
2023-11-19 09:57:24 +01:00
Pierre Le Marre 00e3058e7b Prevent recursive includes of keymap components
- Add check for recursive includes of keymap components. It relies on
  limiting the include depth. The threshold is currently set to 15, which
  seems reasonable with plenty of margin for keymaps in the wild.
- Add corresponding new log message `recursive-include`.
- Add tests for recursive includes.
2023-11-06 22:06:25 +01:00
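
Detecting recursion by bounding the include depth can be sketched as below. The data model (each file includes at most one other file, referenced by index) and all names are hypothetical simplifications; only the threshold of 15 comes from the commit message, and where exactly the real code compares against it is an assumption.

```c
#include <stdbool.h>

#define INCLUDE_MAX_DEPTH 15

/* Hypothetical model: includes[i] is the index of the file included by
 * file i, or -1 if file i includes nothing. Returns false when the chain
 * of includes goes deeper than INCLUDE_MAX_DEPTH, which is how a cycle
 * shows up. */
static bool
process_includes(const int *includes, int file, unsigned int depth)
{
    if (depth > INCLUDE_MAX_DEPTH)
        return false; /* too deep: report as a recursive include */
    if (includes[file] < 0)
        return true;
    return process_includes(includes, includes[file], depth + 1);
}
```

A depth limit does not identify the cycle itself, but it needs no bookkeeping of visited files and also bounds pathological non-cyclic chains.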
Pierre Le Marre 171e0170c2 Fix memory leak in FindFileInXkbPath
The string `buf` was not freed after each call to `asprintf_safe`.

Avoid allocating and introduce the new message: `XKB_ERROR_INSUFFICIENT_BUFFER_SIZE`.
2023-11-06 21:44:41 +01:00
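
Formatting into a caller-provided buffer instead of allocating, as the fix above describes, might look like the following sketch. `build_path` is a hypothetical name; the actual code and the way it reports `XKB_ERROR_INSUFFICIENT_BUFFER_SIZE` are not shown in the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Write "<dir>/<name>" into `buf`. Returns false if formatting failed or
 * the result (including the terminating NUL) does not fit in `size`
 * bytes; the caller can then log an insufficient-buffer error. */
static bool
build_path(char *buf, size_t size, const char *dir, const char *name)
{
    int n = snprintf(buf, size, "%s/%s", dir, name);
    return n >= 0 && (size_t) n < size;
}
```

Because nothing is allocated, there is nothing to free on any return path.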
Pierre Le Marre 2b71431007 Fix memory leaks in key types compilation
When there is no error the types are “stolen” and copied to the keymap.
But when there is an error, `MergeIncludedKeyTypes` just returns without
either “stealing” or freeing the types.

Fixed by explicitly freeing the key types.

Fixed another leak in `HandleKeyTypeDef` that may occur if there is an
error in parsing a type definition.
2023-11-06 21:44:41 +01:00
Sam James fed96378a1 state: fix -Walloc-size
GCC 14 introduces a new -Walloc-size included in -Wextra which gives:
```
src/state.c:589:9: warning: allocation of insufficient size ‘1’ for type ‘struct xkb_state’ with size ‘128’ [-Walloc-size]
```

The calloc prototype is:
```
void *calloc(size_t nmemb, size_t size);
```

So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 struct of size `sizeof(struct xkb_state)`. GCC then sees we're not
doing anything wrong.

Signed-off-by: Sam James <sam@gentoo.org>
2023-11-06 12:32:46 +10:00
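
The argument order the commit above settles on is shown in this small sketch. `struct state` and `state_new` are stand-ins invented for the example, not the real `struct xkb_state`.

```c
#include <stdlib.h>

/* Stand-in type for the example. */
struct state {
    int refcnt;
    char payload[124];
};

static struct state *
state_new(void)
{
    /* calloc(nmemb, size): one member of sizeof(struct state) bytes.
     * Writing calloc(sizeof(*s), 1) allocates the same memory, but
     * triggers -Walloc-size with GCC 14. */
    struct state *s = calloc(1, sizeof(*s));
    if (s)
        s->refcnt = 1;
    return s;
}
```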
Pierre Le Marre 3aaa4e2a53 rules: early detection of invalid encoding 2023-11-03 04:16:50 +01:00
Pierre Le Marre 82e9293e12 xkbcomp: early detection of invalid encoding 2023-11-03 04:16:50 +01:00
Pierre Le Marre a2da57aba1 Compose: early detection of invalid encoding
Also move “unrecognized token” error message before skipping the
line, in order to fix token position.
2023-11-03 04:16:50 +01:00
Pierre Le Marre 6c54681fe7 Compose: Fix UTF-8 BOM detection
The leading UTF-8 BOM detection code is misplaced as it is executed
after each EOL.

Fixed by moving the code before the goto labels.
2023-10-30 12:44:55 +01:00
Pierre Le Marre 9e88718080 rules: skip heading UTF-8 encoded BOM (U+FEFF)
Leading BOM is legal and is used as a signature — an indication that
an otherwise unmarked text file is in UTF-8.
See: https://www.unicode.org/faq/utf_bom.html#bom5 for further details.
2023-10-30 07:15:13 +01:00
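
Skipping a leading BOM amounts to checking the first three bytes of the input once, before scanning starts. A minimal sketch, with a hypothetical function name:

```c
#include <stddef.h>
#include <string.h>

/* Return 3 if `buf` starts with the UTF-8 encoding of U+FEFF
 * (bytes EF BB BF), else 0. The scanner would advance by this amount
 * once, at the very start of the file. */
static size_t
utf8_bom_length(const char *buf, size_t len)
{
    static const char bom[] = "\xef\xbb\xbf";
    if (len >= 3 && memcmp(buf, bom, 3) == 0)
        return 3;
    return 0;
}
```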
Pierre Le Marre f937c30820 xkbcomp: skip heading UTF-8 encoded BOM (U+FEFF)
Leading BOM is legal and is used as a signature — an indication that
an otherwise unmarked text file is in UTF-8.
See: https://www.unicode.org/faq/utf_bom.html#bom5 for further
details.
2023-10-30 07:15:13 +01:00
Pierre Le Marre 59886e4183 Compose: skip heading UTF-8 encoded BOM (U+FEFF)
Leading BOM is legal and is used as a signature — an indication that
an otherwise unmarked text file is in UTF-8.
See: https://www.unicode.org/faq/utf_bom.html#bom5 for further details.
2023-10-30 07:15:13 +01:00
Pierre Le Marre 238d132406 Keysyms: Fix missing hpYdiaeresis
The handling of keysym name guards (e.g. `#ifndef XK_Ydiaeresis`) was
incomplete and led to a missing keysym.

Make `scripts/makeheader` more robust to C macros handling.
2023-10-06 09:37:40 +02:00
Pierre Le Marre 49690d936b Keysyms: Update using latest xorgproto
xorgproto commit: 1c8128d72df22843a2022576850bc5ab5e3a46ea.
2023-09-28 07:48:37 +02:00
Pierre Le Marre 9d15c6a7a1 Show invalid escape sequences
It is easier to debug when the message actually displays the offending
escape sequence.
2023-09-26 17:25:49 +02:00
Pierre Le Marre 0038c86607 Prevent overflow of octal escape sequences
The octal parser accepts the range `\1..\777`. The result is cast to
`char` which will silently overflow.

This commit prevents overflow and will treat `\400..\777` as invalid
escape sequences.
2023-09-26 17:25:49 +02:00
Pierre Le Marre ca7aa69cc0 Disallow producing NULL character with escape sequences
NULL usually terminates the strings; allowing escape sequences to
produce it may lead to undefined behaviour.

- Make NULL escape sequences (e.g. `\0` and `\x0`) invalid.
- Add corresponding test.
- Introduce the new message: XKB_WARNING_INVALID_ESCAPE_SEQUENCE.
2023-09-26 17:25:49 +02:00
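
The two escape-sequence checks above (no value above `\377`, no NULL character) can be combined in one small parser. This is a sketch under assumptions: the function name and interface are hypothetical and the real scanner is structured differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Parse up to three octal digits at `s` (the text right after the
 * backslash). `*consumed` receives the number of digits read. Returns
 * false, leaving `*out` untouched, when there are no digits, when the
 * value is 0 (NULL character) or when it does not fit in one byte. */
static bool
parse_octal_escape(const char *s, unsigned char *out, size_t *consumed)
{
    unsigned int value = 0;
    size_t n = 0;

    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        value = value * 8 + (unsigned int) (s[n] - '0');
        n++;
    }
    *consumed = n;

    if (n == 0 || value == 0 || value > 0377)
        return false;

    *out = (unsigned char) value;
    return true;
}
```

Accumulating into an `unsigned int` and range-checking before the narrowing cast is what avoids the silent overflow.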
Ran Benita a17701327e Compose: add iterator API
Allow users to iterate the entries in a compose table. This is useful
for other projects which want programmable access to the sequences,
without having to write their own parser.

- New API:
  - `xkb_compose_table_entry_sequence`;
  - `xkb_compose_table_entry_keysym`;
  - `xkb_compose_table_entry_utf8`;
  - `xkb_compose_table_iterator_new`;
  - `xkb_compose_table_iterator_free`;
  - `xkb_compose_table_iterator_next`.
- Add tests in `test/compose.c`.
- Add benchmark for compose traversal.
- `tools/compose.c`:
  - Print entries instead of just validating them.
  - Add `--file` option.
  - TODO: make this tool part of the xkbcli commands.

Co-authored-by: Pierre Le Marre <dev@wismill.eu>
Co-authored-by: Ran Benita <ran@unusedvar.com>
Signed-off-by: Ran Benita <ran@unusedvar.com>
2023-09-26 09:02:45 +02:00
Pierre Le Marre c0065c95a4 Messages: merge macros with and without message code
Previously we had two types of macros for logging: with and without
message code. They were intended to be merged afterwards.

The idea is to use a special code – `XKB_LOG_MESSAGE_NO_ID = 0` – that
should *not* be displayed. But we would like to avoid checking this
special code at run time. This is achieved using macro tricks; they
are detailed in the code (see: `PREPEND_MESSAGE_ID`).

Now it is also easier to spot the remaining undocumented log entries:
just search `XKB_LOG_MESSAGE_NO_ID`.
2023-09-24 09:09:24 +02:00
Pierre Le Marre a83d745b62 Messages: add new messages to registry
This commit is another step to identify and document the maximum number
of logging messages. Bulk changes:

- Rename `conflicting-key-type` to `conflicting-key-type-merging-groups`.
  Giving more context in the name allows us to introduce
  `conflicting-key-type-definitions` later.
- Add conflicting-key-type-definitions
- Add conflicting-key-type-map-entry
- Add undeclared-modifiers-in-key-type
  Also improve the log messages.
- Add conflicting-key-type-preserve-entries
- Use XKB_ERROR_UNSUPPORTED_MODIFIER_MASK
- Add illegal-key-type-preserve-result
- Add conflicting-key-type-level-names
- Add duplicate-entry
- Add unsupported-symbols-field
- Add missing-symbols-group-name-index
- Use XKB_ERROR_WRONG_FIELD_TYPE
- Add conflicting-key-name
- Use XKB_WARNING_UNDEFINED_KEYCODE
- Add illegal-keycode-alias
- Add unsupported-geometry-section
- Add missing-default-section
- Add XKB_LOG_MESSAGE_NO_ID
- Rename log_vrb_with_code to log_vrb
- Use ERROR_WRONG_FIELD_TYPE & ERROR_INVALID_SYNTAX
- Add unknown-identifier
- Add invalid-expression-type
- Add invalid-operation + fixes
- Add unknown-operator
- Rename ERROR_UNKNOWN_IDENTIFIER to ERROR_INVALID_IDENTIFIER
- Add undeclared-virtual-modifier
- Add expected-array-entry
- Add invalid-include-statement
- Add included-file-not-found
- Add allocation-error
- Add invalid-included-file
- Process symbols.c
- Add invalid-value
- Add invalid-real-modifier
- Add unknown-field
- Add wrong-scope
- Add invalid-modmap-entry
- Add wrong-statement-type
- Add conflicting-key-symbols-entry
- Add invalid-set-default-statement
2023-09-24 09:09:24 +02:00
Pierre Le Marre eafd3aceca Add a new warning for numeric keysyms
Usually it is better to use the corresponding human-friendly keysym
names. If there is none, then the keysym is most probably not
supported in the ecosystem. The only use case I see is similar to the
PUA in Unicode (see: https://en.wikipedia.org/wiki/Private_Use_Areas).
I am not aware of examples of this kind of use.
2023-09-19 08:49:52 +02:00
Pierre Le Marre ef81d04eef Structured log messages with a message registry
Currently there is little structure in the log messages, making it
difficult to use them for the following use cases:

- A user looking for help about a log message: the user probably
  uses a search engine, thus the results will depend on the proper
  indexing of our documentation and the various forums. It relies
  only on the wording of the message, which may change with time.
- A user wants to filter the logs resulting from the use of one of the
  components of xkbcommon. A typical example would be testing
  xkeyboard-config against libxkbcommon. It requires the use of a
  pattern (simple words detection or regex). The issue is that the
  pattern may become silently out-of-sync with xkbcommon.

A common practice (e.g. in compilers) is to assign unique error codes
to reference these messages, along with an error index for
documentation.

Thus this commit implements the following features:

- Create a message registry (message-registry.yaml) that defines the
  log messages produced by xkbcommon. This is a simple YAML file that
  provides, for each message:

  - A unique numeric code as a short identifier. It is used in the
    output message and thus can easily be filtered to spot errors
    or searched for on the internet. It must not change: if the
    semantics of a message change, it is better to introduce a new
    message for clarity.
  - A unique text identifier, meant for two uses:

    1. Generate constants dealing with log information in our code
       base.
    2. Generate human-friendly names for the documentation.

  - A type: currently warning or error. Used to prefix the constants
    (see above) and for basic classification in documentation.
  - A short description, used as concise and mandatory documentation.
  - An optional detailed description.
  - Optional examples, intended to help users fix issues
    themselves.
  - Version of xkbcommon in which it was added. For old entries this is
    often unknown, so they will default to 1.0.0.
  - Version of xkbcommon in which it was removed (optional).

  No entry should ever be deleted from this index, even if the message
  is not used anymore: it ensures we have unique identifiers along the
  history of xkbcommon, and that users can refer to the documentation
  even for older versions.

- Add the script update-message-registry.py to generate the following
  files:

  - messages.h: message code enumeration for the messages currently
    used in the code base. Currently a private API.
  - message.registry.md: the error index documentation page.

- Modify the logging functions to use structured messages. This is a
  work in progress.
2023-09-19 08:49:52 +02:00
M Kelly e7f02d322a parser: change deprecated `%pure-parser` to `%define api.pure` (#370)
This is now supported by byacc since version 2.0 20230516
2023-08-05 22:29:36 +03:00
Pierre Le Marre 7a815ad307 Fix parsing of numeric keysyms in ExprResolveKeySym
`ExprResolveKeySym` in `expr.c` does not parse non-digit numeric
keysyms.

Fixed by checking upper bound; also add warning messages.
2023-07-14 09:22:24 +02:00
Pierre Le Marre 0da68bc648 Simplify parsing of numeric keysyms in parser.y
In `parser.y`, a numeric keysym is parsed by formatting it in its
hexadecimal form then parsed as a keysym name. This is convoluted.

Fixed by checking directly the upper bound.
2023-07-14 09:22:24 +02:00
Pierre Le Marre 397e7e013d Fix xkb_keysym_from_name for numeric keysyms
When parsing hexadecimal keysym using `xkb_keysym_from_name`,
the result is limited by `parse_keysym_hex` to 0xffffffff, but the
maximum keysym is XKB_MAX_KEYSYM, i.e. 0x1fffffff.

Fixed by adding an upper bound.
2023-07-14 09:22:24 +02:00
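
Bounding a hexadecimal keysym at parse time can be sketched as follows. The maximum of 0x1fffffff comes from the commit message; the function name and interface are hypothetical and not the library's `parse_keysym_hex`.

```c
#include <stdbool.h>
#include <stdint.h>

#define KEYSYM_MAX 0x1fffffffu

/* Parse a hexadecimal string (without the "0x" prefix) into `*out`.
 * Rejects empty input, non-hex characters, more than 8 digits and any
 * value above KEYSYM_MAX. */
static bool
parse_keysym_hex_bounded(const char *s, uint32_t *out)
{
    uint32_t value = 0;
    int digits = 0;

    for (; *s != '\0'; s++, digits++) {
        uint32_t d;
        if (*s >= '0' && *s <= '9')
            d = (uint32_t) (*s - '0');
        else if (*s >= 'a' && *s <= 'f')
            d = (uint32_t) (*s - 'a') + 10;
        else if (*s >= 'A' && *s <= 'F')
            d = (uint32_t) (*s - 'A') + 10;
        else
            return false;
        if (digits >= 8)
            return false; /* a ninth digit would overflow 32 bits */
        value = (value << 4) | d;
    }

    if (digits == 0 || value > KEYSYM_MAX)
        return false;

    *out = value;
    return true;
}
```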
Pierre Le Marre 4823838fa3 Move STRINGIFY to utils.h and add STRINGIFY2 2023-07-14 09:22:24 +02:00
Pierre Le Marre e811743ff2 Add XKB_KEYSYM_MIN and XKB_KEYSYM_MAX
Keysyms are 32-bit integers with the 3 most significant bits always set
to zero. See: Appendix A “KEYSYM Encoding” of the X Window System
Protocol at https://www.x.org/releases/current/doc/xproto/x11protocol.html#keysym_encoding.

Add the new constants XKB_KEYSYM_MIN and XKB_KEYSYM_MAX to make the
interval of valid keysyms more obvious in the code.
2023-07-14 09:22:24 +02:00
alois31 f3210cbf27 compose: drop the 65535 node limit (#343)
In commit 1638409b22, the number of
compose nodes was limited to 65535 to enable "future optimizations",
which apparently means slightly reduced memory usage due to fitting in
a uint16_t. At this time, it was mentioned that the author was not
aware of "any compose files which come close".

However, I'm one of the users that actually do require a larger number
of nodes for their compose file. Thus, use a uint32_t again and raise
the limit significantly.
2023-05-15 15:23:16 +03:00