On 03/10/11 16:28, Stuart Henderson wrote:
> good find.
>
> after reading posix 2008 on this it isn't clear to me what is
> specified, but GNU m4 is clear in the documentation that _they_
> apply it non-recursively.
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/m4.html
> http://www.gnu.org/software/m4/manual/html_node/Translit.html
>
> so imo we definitely want this for -g mode and need to consider
> carefully whether to do it always (in which case the comment above
> the start of map(), which explains why this is done, would also
> need adjusting).
>
> the following from MirOS may also be of interest:
>
> http://junkpile.org/14.patch "fix trace lineno output for 'macro\n'"
> http://junkpile.org/15.patch "let 'errprint' in 'm4 -g' mode behave like GNU"
> http://junkpile.org/16.patch "fix another line number problem"
>
> i'll try and have a play with this and hopefully Marc will
> have some time to look at it soon.
>

Hi,

The xenocara build completed with no issues introduced there, and I
also rebuilt a number of ports.

The info m4 page on OpenBSD looks to be the same as the GNU m4 manual
page for translit. According to its last example, translit does not
recurse:

    translit(`abcdef', `aabdef', `bcged')

    "The final example shows that `a' is mapped to `b', not `c'; the
    resulting `b' is not further remapped to `g'; the `d' and `e' are
    swapped, and the `f' is discarded."

The following example fails to give the correct result on OpenBSD:

    translit(`+,-12345', `+--1-5', `<;>a-c-a')

Here a-c-a is expected to be equivalent to abcba, but the -a on the end
is treated as a literal -a, because back to back ranges were not
implemented.

I think the info m4 examples should be in the regression tests. All but
the back to back range worked (with -g, GNU mode only). In the updated
version attached I have fixed that, as shown in the patch below. This
is the result of testing m4 against gm4 with the examples from the m4
info page:

$ sh test_translit.sh
[HAVE_abc def h/]   [HAVE_abc def h/]
[HAVE_abc~def~h/]   [HAVE_abc~def~h/]
[HAVE_abc/def~h/]   [HAVE_abc/def~h/]
[HAVE_abc/def;h/]   [HAVE_abc/def;h/]
[HAVE_ABC/def;h/]   [HAVE_ABC/def;h/]
[HAVE_ABCZdef;hZ]   [HAVE_ABCZdef;hZ]
ABCDEFGHIJ          ABCDEFGHIJ
ABCDEFGHIJ          ABCDEFGHIJ
ABC-0980-ZYX        ABC-0980-ZYX
ABC-0980-ZYX        ABC-0980-ZYX
s not nix           s not nix
GNUS NOT UNIX       GNUS NOT UNIX
tmfs not fnix       tmfs not fnix
<;>abcba            <;>abcba
bgced               bgced

I will have a look at the patches.

Regards

Nigel Taylor
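
For reference, the two behaviours discussed above (single-pass mapping
with no re-mapping, and back to back ranges such as a-c-a) can be
sketched standalone as below. This is only an illustrative sketch and
not the code in eval.c: the function names expand_ranges and
translit_once are made up for the example, and the behaviour is assumed
from the m4 info examples quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: expand ranges such as "a-e" or "a-c-a" into buf
 * (capacity cap, including the terminating NUL). A leading or trailing
 * '-' is kept as a literal. Output is truncated if it does not fit.
 */
size_t
expand_ranges(const char *src, char *buf, size_t cap)
{
	size_t n = 0;
	unsigned char prev = 0;	/* last character emitted, 0 if none */

	assert(buf != NULL && cap > 0);
	while (*src != '\0' && n + 1 < cap) {
		unsigned char c = (unsigned char)*src++;

		if (c == '-' && prev != 0 && *src != '\0') {
			unsigned char last = (unsigned char)*src++;

			/* prev is already in buf, so step past it first */
			while (prev != last && n + 1 < cap) {
				prev += (prev < last) ? 1 : -1;
				buf[n++] = (char)prev;
			}
			prev = last;	/* a following "-x" continues from here */
		} else {
			buf[n++] = (char)c;
			prev = c;
		}
	}
	buf[n] = '\0';
	return n;
}

/*
 * Hypothetical helper: map every character of src through from -> to
 * exactly once. The result of a mapping is never looked up again, the
 * first mapping given for a character wins, and characters of "from"
 * beyond the end of "to" are deleted. dest must hold strlen(src) + 1.
 */
void
translit_once(char *dest, const char *src, const char *from, const char *to)
{
	unsigned char map[256];
	char seen[256] = { 0 };
	int i;

	for (i = 0; i < 256; i++)
		map[i] = (unsigned char)i;
	for (; *from != '\0'; from++) {
		unsigned char f = (unsigned char)*from;

		if (!seen[f]) {
			seen[f] = 1;
			map[f] = (unsigned char)*to;	/* 0 means delete */
		}
		if (*to != '\0')
			to++;
	}
	for (; *src != '\0'; src++) {
		unsigned char d = map[(unsigned char)*src];

		if (d != 0)
			*dest++ = (char)d;
	}
	*dest = '\0';
}
```

Expanding both sets with expand_ranges and then calling translit_once
on the results reproduces the last two info examples: `+--1-5' expands
to `+,-12345', `<;>a-c-a' expands to `<;>abcba', and translating
`abcdef' with `aabdef' and `bcged' gives `bgced'.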

Index: usr.bin/m4/eval.c
===================================================================
RCS file: /home/cvs/src/usr.bin/m4/eval.c,v
retrieving revision 1.68
diff -u -p -r1.68 eval.c
--- usr.bin/m4/eval.c	7 Sep 2010 19:58:09 -0000	1.68
+++ usr.bin/m4/eval.c	10 Mar 2011 20:18:56 -0000
@@ -884,21 +884,11 @@ dosub(const char *argv[], int argc)
  * language. Within mapvec, we replace every character of "from" with
  * the corresponding character in "to". If "to" is shorter than "from",
  * than the corresponding entries are null, which means that those
- * characters dissapear altogether. Furthermore, imagine
- * map(dest, "sourcestring", "srtin", "rn..*") type call. In this case,
- * `s' maps to `r', `r' maps to `n' and `n' maps to `*'. Thus, `s'
- * ultimately maps to `*'. In order to achieve this effect in an efficient
- * manner (i.e. without multiple passes over the destination string), we
- * loop over mapvec, starting with the initial source character. if the
- * character value (dch) in this location is different than the source
- * character (sch), sch becomes dch, once again to index into mapvec, until
- * the character value stabilizes (i.e. sch = dch, in other words
- * mapvec[n] == n). Even if the entry in the mapvec is null for an ordinary
- * character, it will stabilize, since mapvec[0] == 0 at all times. At the
- * end, we restore mapvec* back to normal where mapvec[n] == n for
- * 0 <= n <= 127. This strategy, along with the restoration of mapvec, is
- * about 5 times faster than any algorithm that makes multiple passes over
- * destination string.
+ * characters dissapear altogether.
+ * The recursion has been removed to match gnu m4 implementation and
+ * matches the m4 info details.
+ * At the end, we restore mapvec* back to normal where mapvec[n] == n for
+ * 0 <= n <= 255.
  */
 static void
 map(char *dest, const char *src, const char *from, const char *to)
@@ -958,10 +948,6 @@ map(char *dest, const char *src, const c
 		while (*src) {
 			sch = (unsigned char)(*src++);
 			dch = mapvec[sch];
-			while (dch != sch) {
-				sch = dch;
-				dch = mapvec[sch];
-			}
 			if ((*dest = (char)dch))
 				dest++;
 		}
@@ -993,7 +979,7 @@ handledash(char *buffer, char *end, cons
 			unsigned char i;
 			if ((unsigned char)src[0] <= (unsigned char)src[2]) {
 				for (i = (unsigned char)src[0];
-				    i <= (unsigned char)src[2]; i++) {
+				    i < (unsigned char)src[2]; i++) {
 					*p++ = i;
 					if (p == end) {
 						*p = '\0';
@@ -1002,7 +988,7 @@ handledash(char *buffer, char *end, cons
 				}
 			} else {
 				for (i = (unsigned char)src[0];
-				    i >= (unsigned char)src[2]; i--) {
+				    i > (unsigned char)src[2]; i--) {
 					*p++ = i;
 					if (p == end) {
 						*p = '\0';
@@ -1010,7 +996,36 @@ handledash(char *buffer, char *end, cons
 					}
 				}
 			}
-			src += 3;
+			src += 2;
+/* check for back to back range */
+			if (src[1] == '-' && src[2]) {
+				if ((unsigned char)src[0] <= (unsigned char)src[2]) {
+					for (i = (unsigned char)src[0];
+					    i <= (unsigned char)src[2]; i++) {
+						*p++ = i;
+						if (p == end) {
+							*p = '\0';
+							return buffer;
+						}
+					}
+				} else {
+					for (i = (unsigned char)src[0];
+					    i >= (unsigned char)src[2]; i--) {
+						*p++ = i;
+						if (p == end) {
+							*p = '\0';
+							return buffer;
+						}
+					}
+				}
+			}
+			else {
+				*p++ = *src++;
+				if (p == end) {
+					*p = '\0';
+					return buffer;
+				}
+			}
 		} else
 			*p++ = *src++;
 		if (p == end)
Index: regress/usr.bin/m4/Makefile
===================================================================
RCS file: /home/cvs/src/regress/usr.bin/m4/Makefile,v
retrieving revision 1.28
diff -u -p -r1.28 Makefile
--- regress/usr.bin/m4/Makefile	23 Mar 2010 20:11:52 -0000	1.28
+++ regress/usr.bin/m4/Makefile	10 Mar 2011 14:22:45 -0000
@@ -12,7 +12,7 @@ REGRESS_TARGETS= test-ff_after_dnl test-
     test-weird test-args test-args2 test-esyscmd test-eval test-gnupatterns \
     test-gnupatterns2 test-comments test-synch1 test-synch1bis \
     test-gnuformat test-includes test-dumpdef test-gnuprefix \
-    test-translit
+    test-translit test-gnutranslit
 
 test-ff_after_dnl: ff_after_dnl.m4
 	${M4} ff_after_dnl.m4 | diff - ${.CURDIR}/ff_after_dnl.out
@@ -102,6 +102,9 @@ test-dumpdef:
 test-gnuprefix:
 	${M4} -P ${.CURDIR}/gnuprefix.m4 2>&1 | \
 		diff -u - ${.CURDIR}/gnuprefix.out
+
+test-gnutranslit:
+	${M4} -g ${.CURDIR}/gnutranslit.m4 | diff -u - ${.CURDIR}/gnutranslit.out
 
 .PHONY: ${REGRESS_TARGETS}
 
Index: regress/usr.bin/m4/gnutranslit.m4
===================================================================
RCS file: regress/usr.bin/m4/gnutranslit.m4
diff -N regress/usr.bin/m4/gnutranslit.m4
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ regress/usr.bin/m4/gnutranslit.m4	10 Mar 2011 20:31:20 -0000
@@ -0,0 +1,13 @@
+translit(`[HAVE_abc/def.h
+]', `
+/.', `/  ')
+translit(`[HAVE_abc/def.h=]', `=/.', `/~~')
+translit(`0123456789', `0123456789', `ABCDEFGHIJ')
+translit(`0123456789', `[0-9]', `[A-J]')
+translit(`abc-0980-zyx', `abcdefghijklmnopqrstuvwxyz', `ABCDEFGHIJKLMNOPQRSTUVWXYZ')
+translit(`abc-0980-zyx', `[a-z]', `[A-Z]')
+translit(`GNUs not Unix', `A-Z')
+translit(`GNUs not Unix', `a-z', `A-Z')
+translit(`GNUs not Unix', `A-Z', `z-a')
+translit(`+,-12345', `+--1-5', `<;>a-c-a')
+translit(`abcdef', `aabdef', `bcged')
Index: regress/usr.bin/m4/gnutranslit.out
===================================================================
RCS file: regress/usr.bin/m4/gnutranslit.out
diff -N regress/usr.bin/m4/gnutranslit.out
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ regress/usr.bin/m4/gnutranslit.out	10 Mar 2011 20:31:28 -0000
@@ -0,0 +1,11 @@
+[HAVE_abc def h/]
+[HAVE_abc~def~h/]
+ABCDEFGHIJ
+ABCDEFGHIJ
+ABC-0980-ZYX
+ABC-0980-ZYX
+s not nix
+GNUS NOT UNIX
+tmfs not fnix
+<;>abcba
+bgced