> * Paul Eggert <[EMAIL PROTECTED]> [2005-08-20 23:01:23 -0700]:
>
> Sam Steingold <[EMAIL PROTECTED]> writes:
>
>> the latest and greatest gnulib regexp has the following regressions vs
>> the previous (monolithic) version:
>
> Sorry, I didn't understand the notation that you used in
> <http://lists.gnu.org/archive/html/bug-gnulib/2005-08/msg00008.html>.

;; common lisp:
(defun re-test (pattern string)
  (mapcar (lambda (match)
            (and match (regexp:match-string string match)))
          (multiple-value-list
           (regexp:regexp-exec (regexp:regexp-compile pattern :extended t)
                               string))))

This function takes an extended regular expression pattern and a string
and tries to match them, returning a list of the substrings of the string
that matched the subexpressions of the pattern.

Form: (RE-TEST "(^)*" "-")
;; pattern = (^)*
;; string = -
CORRECT: ("" "")
;; the previous (single-file) version returned two matches: one for
;; the whole expression and one for the first subexpression; both
;; had length 0
CLISP  : ("" NIL)
;; the current (multi-file) version returns just one match,
;; for the whole expression, and no match for the subexpression
;; this is the explanation of how ("" "") is different from ("" NIL)
Differ at position 1: "" vs NIL
CORRECT: ("")
CLISP  : (NIL)

> I tried to reproduce the problems by writing a C program (enclosed
> below) and it seems to me that the gnulib regexp is correct in all
> these test cases.  Perhaps the old regexp was broken.

Frankly, I don't know and don't care whether the old or the new version
was / is broken.  All I care about is consistency.
May I suggest that you add regression testing to the parts of gnulib
that exhibit non-trivial functionality, like regex?
Does glibc come with regression tests?  Do those tests cover regex?
Consistency over time - or at least explicitly documented changes - is
quite important (IMNSHO).

Actually, a careful examination of the examples appears to indicate
that the previous behavior was "more" correct.
Specifically, the first 3 of the 6 regressions are clearly bugs in the
current regex implementation, while the last 3 are acceptable - but
undesirable - variations.
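
For readers who do not use CLISP, the RE-TEST notation above can be approximated in C on top of the POSIX regcomp/regexec interface. This is only a sketch: the names re_test, RE_TEST_MAX and RE_TEST_BUF are made up for illustration and come neither from gnulib nor from the C program mentioned in the quoted text. Slot 0 corresponds to the whole match and slot i to the i-th subexpression; matched[i] == 0 plays the role of NIL. What a given regex implementation reports for the subexpression of "(^)*" is exactly what is disputed here, so no expected value is shown for it.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

#define RE_TEST_MAX 10   /* hypothetical limit: whole match + 9 subexpressions */
#define RE_TEST_BUF 256  /* hypothetical limit on the length of one substring */

/* Hypothetical analogue of RE-TEST.
   Compiles PATTERN as a POSIX extended regex and matches it against STRING.
   For every slot i, matched[i] is 1 if that (sub)expression matched and
   out[i] holds a copy of the matched substring; matched[i] is 0 (the
   analogue of NIL) if it did not participate in the match.
   Returns the number of slots filled, 0 if there is no match at all,
   or -1 if the pattern does not compile. */
int re_test(const char *pattern, const char *string,
            char out[RE_TEST_MAX][RE_TEST_BUF], int matched[RE_TEST_MAX])
{
    regex_t re;
    regmatch_t m[RE_TEST_MAX];
    size_t n, i;

    assert(pattern != NULL && string != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* re_nsub counts parenthesized subexpressions; slot 0 is the whole match. */
    n = re.re_nsub + 1;
    if (n > RE_TEST_MAX)
        n = RE_TEST_MAX;

    if (regexec(&re, string, n, m, 0) != 0) {
        regfree(&re);
        return 0;
    }

    for (i = 0; i < n; i++) {
        /* An offset of -1 marks a subexpression that did not match. */
        matched[i] = (m[i].rm_so != -1);
        out[i][0] = '\0';
        if (matched[i]) {
            size_t len = (size_t)(m[i].rm_eo - m[i].rm_so);
            if (len >= RE_TEST_BUF)
                len = RE_TEST_BUF - 1;   /* truncate overly long matches */
            memcpy(out[i], string + m[i].rm_so, len);
            out[i][len] = '\0';
        }
    }

    regfree(&re);
    return (int)n;
}
```

Calling re_test("(^)*", "-", out, matched) and then inspecting matched[1] distinguishes the two results discussed above: 1 with an empty out[1] corresponds to ("" ""), and 0 corresponds to ("" NIL).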

--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.iris.org.il> <http://www.palestinefacts.org/>
<http://www.jihadwatch.org/> <http://www.openvotingconsortium.org/>
Never succeed from the first try - if you do, nobody will think it was hard.