Hi Arnold,

> > > (And how I've documented things in the manual, also since forever.)
> >
> > If you want the behaviour of the GNU regex to be stable over time, you
> > should contribute unit tests to tests/test-regex.c.
>
> This is a separate issue. It almost sounds like you're saying "it's your
> fault there's a bug here, you didn't contribute unit tests".

I'm not talking about past incidents and "fault", because that is
generally useless. I'm talking about the future and what we can do to
keep packages that depend on the 'regex' module from seeing regressions.

If a Gnulib module does not have decent test coverage in Gnulib, then
its bugs and regressions become apparent only after a while and only
through these other packages. A good example of this sequence of events
was <https://lists.gnu.org/archive/html/bug-gnulib/2020-07/msg00036.html>,
but I'm sure you can find many others of the same kind in the mailing
list archive.

If, on the other hand, there is a unit test and it runs on glibc
platforms, a regression is likely to be visible in the weekly continuous
integration build <https://gitlab.com/gnulib/gnulib-ci/-/pipelines>.

For the regex module, with 20 KB of tests for 300 KB of code full of
complex algorithms, the test coverage is very thin, and it is *to be
expected* that regressions are only visible once the code is integrated
into gawk, grep, sed, etc. Similarly for the 'dfa' module, with 5 KB of
tests for 140 KB of code.

The regex and dfa modules are being maintained here (by Paul, with
contributions from various people), and we have seen that it is not
obvious whether a patch is good or not: sometimes Paul has rejected
patches, sometimes he has had to revert patches.

I think it would be good if these two modules had more test coverage,
and I'm inviting everyone who can to contribute to these unit tests.

> I hope that's not your intent; if it is then sorry, I don't buy it.

The module doesn't have tests for the
  RE_SYNTAX_AWK
  RE_SYNTAX_GNU_AWK
  RE_SYNTAX_POSIX_AWK
syntaxes. It's gawk which depends on the correct functioning of these
syntaxes, not glibc, not grep, not sed, not emacs. Therefore IMO if the
gawk developers don't contribute some test cases for these syntaxes,
no one will.
(I certainly won't, because I find writing tests a bit boring, and I
don't see why I should have the "boring" part whereas others have the
"fun" part :-) )

Bruno