On Mon, Apr 07, 2025 at 12:46:11PM -0500, Eric Blake wrote:
> And relying on a configure-time test of what m4 supports on the
> packager's machine is not necessarily going to work when autoconf is
> run on a developer's machine with a different version of m4. Which in
> turn implies that it is desirable to be able to probe at runtime what
> is supported, rather than being limited to a command-line switch. But
> while it is easy to write a runtime probe on whether "regexp([{],
> [\{1\}]) results in -1 (old m4, ergo \{ is a literal) or 0 (m4 that
> has enabled repetition operator semantics),

Correcting myself: the above would return -1 regardless of whether \{
is literal or repetition (since a repetition is no good without an
earlier sequence to repeat). If you want to probe a -1 or 0 value as a
witness of regex flavor, then the haystack and needle have to be
something a bit smarter. As in:

regexp([{{], [.\{1\}])

which returns -1 when \{ is literal and 0 when it is a repetition.

> it doesn't work if doing
> the probe itself triggers a warning when you have only opted in to the
> portability diagnosis rather than the new semantics.

This part is still true, if there is no runtime way to turn warnings
on or off independently of changing syntax.

> > > > > lib/autoconf/general.m4:[m4_if(m4_bregexp([$1],
> > > [#\|\\\|`\|\(\$\|@S|@\)\((|{|@{:@\)]), [-1],
> > > > > lib/m4sugar/m4sugar.m4:
> > > [@\(\(<:\|:>\|S|\|%:\|\{:\|:\}\)\(@\)\|&t@\)],
> > > > be *indifferent* to whether { or \{ is a literal or an operator.

If you WANT a regex that is indifferent to { or \{ being a literal and
the other a metacharacter, you can always use "[{]", which is
guaranteed to be a literal regardless of the flavor of { and \{
outside [].

If you WANT a regex that expresses repetition, your only solutions are
"newer m4 where that regex is supported" or "spell out the repetitions
yourself": "aaaaa" instead of "a\{5\}", or "\([ab][ab][ab]?\)" instead
of "[ab]\{2,3\}". And since m4 is Turing complete, you could even
pre-process any regex with \{digit\} into a corresponding regex that IS
portable to older versions, although it may require adding yet another
layer of \(\) grouping and thus rewriting the substitution of any
\DIGIT back-refs, and the amount of work to process your regex into
something usable would make regex matching even slower.

Still, this makes me wonder: is a single warning good enough ("your
code has \{, but this changes semantics depending on m4 version"), or
do we want two orthogonal warnings? One would occur only on a \{ that
is blatantly not a repetition (i.e., when the next character is neither
a digit nor a comma: "this will fail to compile in the future; use bare
{ instead to continue to match literal"). The other would occur only on
something that looks like a repetition, regardless of whether some
other, orthogonal switch enables or disables repetitions: "this regex
is not portable to all versions of m4, based on whether repetitions are
enabled".

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org