On 03/10/11 20:38, Nigel Taylor wrote:
> On 03/10/11 16:28, Stuart Henderson wrote:
>> good find.
>>
>> after reading posix 2008 on this it isn't clear to me what is
>> specified, but GNU m4 is clear in the documentation that _they_
>> apply it non-recursively.
>>
>> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/m4.html
>> http://www.gnu.org/software/m4/manual/html_node/Translit.html
>>
>> so imo we definitely want this for -g mode and need to consider
>> carefully whether to do it always (in which case the comment above
>> the start of map(), which explains why this is done, would also
>> need adjusting).
>>
>> the following from MirOS may also be of interest:
>>
>> http://junkpile.org/14.patch "fix trace lineno output for 'macro\n'"
>> http://junkpile.org/15.patch "let 'errprint' in 'm4 -g' mode behave like GNU"
>> http://junkpile.org/16.patch "fix another line number problem"
>>
>> i'll try and have a play with this and hopefully Marc will
>> have some time to look at it soon.
>>
>>
> Hi,
>
> xenocara build completed, no issue introduced there, also rebuilt a number
> of ports.
>
> The info m4 page on OpenBSD looks to be the same as the gnu m4 manual page
> for translit. According to the last example, translit does not recurse:
>
> "translit(`abcdef', `aabdef', `bcged')
>
> The final example shows that `a' is mapped to `b', not `c'; the resulting
> `b' is not further remapped to `g'; the `d' and `e' are swapped, and the
> `f' is discarded."
>
>
> "translit(`+,-12345', `+--1-5', `<;>a-c-a')
>
> This fails to give the correct result on OpenBSD: a-c-a is expected to be
> equivalent to abcba, but the -a on the end is treated as a literal -a."
> Back to back ranges are not implemented yet.
>
> I think the info m4 examples should be in the regression tests. All but the
> back to back range worked (-g only gnu). With the updated version attached
> I have fixed that as below; result from the test, m4 vs gm4, as per m4 info.
>
> $ sh test_translit.sh
> [HAVE_abc def h/]   [HAVE_abc def h/]
> [HAVE_abc~def~h/]   [HAVE_abc~def~h/]
> [HAVE_abc/def~h/]   [HAVE_abc/def~h/]
> [HAVE_abc/def;h/]   [HAVE_abc/def;h/]
> [HAVE_ABC/def;h/]   [HAVE_ABC/def;h/]
> [HAVE_ABCZdef;hZ]   [HAVE_ABCZdef;hZ]
> ABCDEFGHIJ          ABCDEFGHIJ
> ABCDEFGHIJ          ABCDEFGHIJ
> ABC-0980-ZYX        ABC-0980-ZYX
> ABC-0980-ZYX        ABC-0980-ZYX
> s not nix           s not nix
> GNUS NOT UNIX       GNUS NOT UNIX
> tmfs not fnix       tmfs not fnix
> <;>abcba            <;>abcba
> bgced               bgced
>
> I will have a look at the patches.
>
> Regards
>
> Nigel Taylor
>
>
>

Hi,

I missed incrementing the pointer after the back-to-back range; the
corrected diff is below.

Regards

Nigel Taylor

Index: usr.bin/m4/eval.c
===================================================================
RCS file: /home/cvs/src/usr.bin/m4/eval.c,v
retrieving revision 1.68
diff -u -p -r1.68 eval.c
--- usr.bin/m4/eval.c	7 Sep 2010 19:58:09 -0000	1.68
+++ usr.bin/m4/eval.c	10 Mar 2011 20:58:34 -0000
@@ -884,21 +884,11 @@ dosub(const char *argv[], int argc)
  * language. Within mapvec, we replace every character of "from" with
  * the corresponding character in "to". If "to" is shorter than "from",
  * than the corresponding entries are null, which means that those
- * characters dissapear altogether. Furthermore, imagine
- * map(dest, "sourcestring", "srtin", "rn..*") type call. In this case,
- * `s' maps to `r', `r' maps to `n' and `n' maps to `*'. Thus, `s'
- * ultimately maps to `*'. In order to achieve this effect in an efficient
- * manner (i.e. without multiple passes over the destination string), we
- * loop over mapvec, starting with the initial source character. if the
- * character value (dch) in this location is different than the source
- * character (sch), sch becomes dch, once again to index into mapvec, until
- * the character value stabilizes (i.e. sch = dch, in other words
- * mapvec[n] == n). Even if the entry in the mapvec is null for an ordinary
- * character, it will stabilize, since mapvec[0] == 0 at all times. At the
- * end, we restore mapvec* back to normal where mapvec[n] == n for
- * 0 <= n <= 127. This strategy, along with the restoration of mapvec, is
- * about 5 times faster than any algorithm that makes multiple passes over
- * destination string.
+ * characters dissapear altogether.
+ * The recursion has been removed to match gnu m4 implementation and
+ * matches the m4 info details.
+ * At the end, we restore mapvec* back to normal where mapvec[n] == n for
+ * 0 <= n <= 255.
  */
 static void
 map(char *dest, const char *src, const char *from, const char *to)
@@ -958,10 +948,6 @@ map(char *dest, const char *src, const c
 		while (*src) {
 			sch = (unsigned char)(*src++);
 			dch = mapvec[sch];
-			while (dch != sch) {
-				sch = dch;
-				dch = mapvec[sch];
-			}
 			if ((*dest = (char)dch))
 				dest++;
 		}
@@ -993,7 +979,7 @@ handledash(char *buffer, char *end, cons
 			unsigned char i;
 			if ((unsigned char)src[0] <= (unsigned char)src[2]) {
 				for (i = (unsigned char)src[0];
-				    i <= (unsigned char)src[2]; i++) {
+				    i < (unsigned char)src[2]; i++) {
 					*p++ = i;
 					if (p == end) {
 						*p = '\0';
@@ -1002,7 +988,7 @@ handledash(char *buffer, char *end, cons
 				}
 			} else {
 				for (i = (unsigned char)src[0];
-				    i >= (unsigned char)src[2]; i--) {
+				    i > (unsigned char)src[2]; i--) {
 					*p++ = i;
 					if (p == end) {
 						*p = '\0';
@@ -1010,7 +996,37 @@ handledash(char *buffer, char *end, cons
 					}
 				}
 			}
-			src += 3;
+			src += 2;
+/* check for back to back range */
+			if (src[1] == '-' && src[2]) {
+				if ((unsigned char)src[0] <= (unsigned char)src[2]) {
+					for (i = (unsigned char)src[0];
+					    i <= (unsigned char)src[2]; i++) {
+						*p++ = i;
+						if (p == end) {
+							*p = '\0';
+							return buffer;
+						}
+					}
+				} else {
+					for (i = (unsigned char)src[0];
+					    i >= (unsigned char)src[2]; i--) {
+						*p++ = i;
+						if (p == end) {
+							*p = '\0';
+							return buffer;
+						}
+					}
+				}
+				src += 3;
+			}
+			else {
+				*p++ = *src++;
+				if (p == end) {
+					*p = '\0';
+					return buffer;
+				}
+			}
 		} else
 			*p++ = *src++;
 		if (p == end)
Index: regress/usr.bin/m4/Makefile
===================================================================
RCS file: /home/cvs/src/regress/usr.bin/m4/Makefile,v
retrieving revision 1.28
diff -u -p -r1.28 Makefile
--- regress/usr.bin/m4/Makefile	23 Mar 2010 20:11:52 -0000	1.28
+++ regress/usr.bin/m4/Makefile	10 Mar 2011 14:22:45 -0000
@@ -12,7 +12,7 @@ REGRESS_TARGETS= test-ff_after_dnl test-
 	test-weird test-args test-args2 test-esyscmd test-eval test-gnupatterns \
 	test-gnupatterns2 test-comments test-synch1 test-synch1bis \
 	test-gnuformat test-includes test-dumpdef test-gnuprefix \
-	test-translit
+	test-translit test-gnutranslit
 
 test-ff_after_dnl: ff_after_dnl.m4
 	${M4} ff_after_dnl.m4 | diff - ${.CURDIR}/ff_after_dnl.out
@@ -102,6 +102,9 @@ test-dumpdef:
 test-gnuprefix:
 	${M4} -P ${.CURDIR}/gnuprefix.m4 2>&1 | \
 	diff -u - ${.CURDIR}/gnuprefix.out
+
+test-gnutranslit:
+	${M4} -g ${.CURDIR}/gnutranslit.m4 | diff -u - ${.CURDIR}/gnutranslit.out
 
 .PHONY: ${REGRESS_TARGETS}
 
Index: regress/usr.bin/m4/gnutranslit.m4
===================================================================
RCS file: regress/usr.bin/m4/gnutranslit.m4
diff -N regress/usr.bin/m4/gnutranslit.m4
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ regress/usr.bin/m4/gnutranslit.m4	10 Mar 2011 20:31:20 -0000
@@ -0,0 +1,13 @@
+translit(`[HAVE_abc/def.h
+]', `
+/.', `/ ')
+translit(`[HAVE_abc/def.h=]', `=/.', `/~~')
+translit(`0123456789', `0123456789', `ABCDEFGHIJ')
+translit(`0123456789', `[0-9]', `[A-J]')
+translit(`abc-0980-zyx', `abcdefghijklmnopqrstuvwxyz', `ABCDEFGHIJKLMNOPQRSTUVWXYZ')
+translit(`abc-0980-zyx', `[a-z]', `[A-Z]')
+translit(`GNUs not Unix', `A-Z')
+translit(`GNUs not Unix', `a-z', `A-Z')
+translit(`GNUs not Unix', `A-Z', `z-a')
+translit(`+,-12345', `+--1-5', `<;>a-c-a')
+translit(`abcdef', `aabdef', `bcged')
Index: regress/usr.bin/m4/gnutranslit.out
===================================================================
RCS file: regress/usr.bin/m4/gnutranslit.out
diff -N regress/usr.bin/m4/gnutranslit.out
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ regress/usr.bin/m4/gnutranslit.out	10 Mar 2011 20:31:28 -0000
@@ -0,0 +1,11 @@
+[HAVE_abc def h/]
+[HAVE_abc~def~h/]
+ABCDEFGHIJ
+ABCDEFGHIJ
+ABC-0980-ZYX
+ABC-0980-ZYX
+s not nix
+GNUS NOT UNIX
+tmfs not fnix
+<;>abcba
+bgced
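
For anyone who wants to check the two behaviours outside of m4, here is a
minimal standalone sketch of what the diff is aiming for. It is not the code
from eval.c: expand_ranges() and translit_single_pass() are hypothetical
names, the fixed 512-byte buffers are an arbitrary choice, and letting the
first occurrence of a character in "from" win is an assumption taken from the
GNU manual example quoted above (`a' maps to `b', not `c'). Unlike the nested
check in the patched handledash(), the loop here chains any number of
back-to-back ranges by leaving the range end point to be picked up on the
next iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand "x-y" ranges from src into dst (cap bytes, including the NUL).
 * Back-to-back ranges share their middle end point, so "a-c-a" expands
 * to "abcba" and not to "abc-a". Output is truncated if dst is too small.
 */
static void
expand_ranges(char *dst, size_t cap, const char *src)
{
	size_t n = 0;

	assert(cap > 0);
	while (*src != '\0' && n + 1 < cap) {
		if (src[1] == '-' && src[2] != '\0') {
			int c = (unsigned char)src[0];
			int last = (unsigned char)src[2];
			int step = (c <= last) ? 1 : -1;

			/* emit the range up to, but not including, its end */
			for (; c != last && n + 1 < cap; c += step)
				dst[n++] = (char)c;
			/*
			 * Step onto the end point only; the next pass emits
			 * it as a plain character or as the start of a
			 * chained range.
			 */
			src += 2;
		} else
			dst[n++] = *src++;
	}
	dst[n] = '\0';
}

/*
 * Single-pass transliteration: every source character is looked up
 * exactly once, so a result character is never remapped again.
 * dst must have room for strlen(src) + 1 bytes.
 */
void
translit_single_pass(char *dst, const char *src, const char *from,
    const char *to)
{
	char f[512], t[512];
	unsigned char map[256];
	unsigned char seen[256] = { 0 };
	const char *fp, *tp;
	int i;

	expand_ranges(f, sizeof(f), from);
	expand_ranges(t, sizeof(t), to);

	for (i = 0; i < 256; i++)
		map[i] = (unsigned char)i;	/* identity by default */

	for (fp = f, tp = t; *fp != '\0'; fp++) {
		unsigned char c = (unsigned char)*fp;

		if (!seen[c]) {			/* assumption: first one wins */
			seen[c] = 1;
			map[c] = (unsigned char)*tp;	/* 0 once "to" runs out */
		}
		if (*tp != '\0')
			tp++;
	}

	for (; *src != '\0'; src++) {
		unsigned char d = map[(unsigned char)*src];

		if (d != 0)			/* 0 means delete */
			*dst++ = (char)d;
	}
	*dst = '\0';
}
```

Calling translit_single_pass(buf, "+,-12345", "+--1-5", "<;>a-c-a") should
leave "<;>abcba" in buf, and the `abcdef'/`aabdef'/`bcged' case should give
"bgced", the same as the last two lines of gnutranslit.out.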