On 03/10/11 16:28, Stuart Henderson wrote:
> good find.
> 
> after reading posix 2008 on this it isn't clear to me what is
> specified, but GNU m4 is clear in the documentation that _they_
> apply it non-recursively.
> 
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/m4.html
> http://www.gnu.org/software/m4/manual/html_node/Translit.html
> 
> so imo we definitely want this for -g mode and need to consider
> carefully whether to do it always (in which case the comment above
> the start of map(), which explains why this is done, would also
> need adjusting).
> 
> the following from MirOS may also be of interest:
> 
> http://junkpile.org/14.patch "fix trace lineno output for 'macro\n'"
> http://junkpile.org/15.patch "let 'errprint' in 'm4 -g' mode behave like GNU"
> http://junkpile.org/16.patch "fix another line number problem"
> 
> i'll try and have a play with this and hopefully Marc will
> have some time to look at it soon.
> 
> 
Hi,

The xenocara build completed with no issues introduced there, and I also
rebuilt a number of ports.

The m4 info page on OpenBSD looks to be the same as the GNU m4 manual page for
translit. According to its last example, translit does not recurse:

"translit(`abcdef', `aabdef', `bcged')

The final example shows that `a' is mapped to `b', not `c'; the resulting `b' is
not further remapped to `g'; the `d' and `e' are swapped, and the `f' is
discarded."


"translit(`+,-12345', `+--1-5', `<;>a-c-a')

This fails to give the correct result on OpenBSD: a-c-a is expected to be
equivalent to abcba, but the -a on the end is treated as a literal -a."
Back-to-back ranges are not implemented yet.

I think the info m4 examples should be in the regression tests. All but the
back-to-back range worked (with -g only, i.e. GNU mode). I have fixed that in
the updated version attached below. Here are the results of testing m4 against
gm4 with the examples from the m4 info page:

$ sh test_translit.sh
[HAVE_abc def h/]   [HAVE_abc def h/]
[HAVE_abc~def~h/]   [HAVE_abc~def~h/]
[HAVE_abc/def~h/]   [HAVE_abc/def~h/]
[HAVE_abc/def;h/]   [HAVE_abc/def;h/]
[HAVE_ABC/def;h/]   [HAVE_ABC/def;h/]
[HAVE_ABCZdef;hZ]   [HAVE_ABCZdef;hZ]
ABCDEFGHIJ          ABCDEFGHIJ
ABCDEFGHIJ          ABCDEFGHIJ
ABC-0980-ZYX        ABC-0980-ZYX
ABC-0980-ZYX        ABC-0980-ZYX
s not nix           s not nix
GNUS NOT UNIX       GNUS NOT UNIX
tmfs not fnix       tmfs not fnix
<;>abcba            <;>abcba
bgced               bgced

I will have a look at the patches.

Regards

Nigel Taylor




Index: usr.bin/m4/eval.c
===================================================================
RCS file: /home/cvs/src/usr.bin/m4/eval.c,v
retrieving revision 1.68
diff -u -p -r1.68 eval.c
--- usr.bin/m4/eval.c   7 Sep 2010 19:58:09 -0000       1.68
+++ usr.bin/m4/eval.c   10 Mar 2011 20:18:56 -0000
@@ -884,21 +884,11 @@ dosub(const char *argv[], int argc)
  * language. Within mapvec, we replace every character of "from" with
  * the corresponding character in "to". If "to" is shorter than "from",
  * than the corresponding entries are null, which means that those
- * characters dissapear altogether. Furthermore, imagine
- * map(dest, "sourcestring", "srtin", "rn..*") type call. In this case,
- * `s' maps to `r', `r' maps to `n' and `n' maps to `*'. Thus, `s'
- * ultimately maps to `*'. In order to achieve this effect in an efficient
- * manner (i.e. without multiple passes over the destination string), we
- * loop over mapvec, starting with the initial source character. if the
- * character value (dch) in this location is different than the source
- * character (sch), sch becomes dch, once again to index into mapvec, until
- * the character value stabilizes (i.e. sch = dch, in other words
- * mapvec[n] == n). Even if the entry in the mapvec is null for an ordinary
- * character, it will stabilize, since mapvec[0] == 0 at all times. At the
- * end, we restore mapvec* back to normal where mapvec[n] == n for
- * 0 <= n <= 127. This strategy, along with the restoration of mapvec, is
- * about 5 times faster than any algorithm that makes multiple passes over
- * destination string.
+ * characters disappear altogether.
+ * The recursion has been removed to match the GNU m4 implementation
+ * and the behaviour described in the m4 info page.
+ * At the end, we restore mapvec* back to normal where mapvec[n] == n for
+ * 0 <= n <= 255.
  */
 static void
 map(char *dest, const char *src, const char *from, const char *to)
@@ -958,10 +948,6 @@ map(char *dest, const char *src, const c
                while (*src) {
                        sch = (unsigned char)(*src++);
                        dch = mapvec[sch];
-                       while (dch != sch) {
-                               sch = dch;
-                               dch = mapvec[sch];
-                       }
                        if ((*dest = (char)dch))
                                dest++;
                }
@@ -993,7 +979,7 @@ handledash(char *buffer, char *end, cons
                        unsigned char i;
                        if ((unsigned char)src[0] <= (unsigned char)src[2]) {
                                for (i = (unsigned char)src[0]; 
-                                   i <= (unsigned char)src[2]; i++) {
+                                   i < (unsigned char)src[2]; i++) {
                                        *p++ = i;
                                        if (p == end) {
                                                *p = '\0';
@@ -1002,7 +988,7 @@ handledash(char *buffer, char *end, cons
                                }
                        } else {
                                for (i = (unsigned char)src[0]; 
-                                   i >= (unsigned char)src[2]; i--) {
+                                   i > (unsigned char)src[2]; i--) {
                                        *p++ = i;
                                        if (p == end) {
                                                *p = '\0';
@@ -1010,7 +996,36 @@ handledash(char *buffer, char *end, cons
                                        }
                                }
                        }
-                       src += 3;
+                       src += 2;
+/*                     check for back to back range */
+                        if (src[1] == '-' && src[2]) {
+                               if ((unsigned char)src[0] <= (unsigned char)src[2]) {
+                                       for (i = (unsigned char)src[0]; 
+                                           i <= (unsigned char)src[2]; i++) {
+                                               *p++ = i;
+                                               if (p == end) {
+                                                       *p = '\0';
+                                                       return buffer;
+                                               }
+                                       }
+                               } else {
+                                       for (i = (unsigned char)src[0]; 
+                                           i >= (unsigned char)src[2]; i--) {
+                                               *p++ = i;
+                                               if (p == end) {
+                                                       *p = '\0';
+                                                       return buffer;
+                                               }
+                                       }
+                               }
+                        }
+                        else {
+                               *p++ = *src++;
+                               if (p == end) {
+                                       *p = '\0';
+                                       return buffer;
+                               }
+                        }
                } else
                        *p++ = *src++;
                if (p == end)
Index: regress/usr.bin/m4/Makefile
===================================================================
RCS file: /home/cvs/src/regress/usr.bin/m4/Makefile,v
retrieving revision 1.28
diff -u -p -r1.28 Makefile
--- regress/usr.bin/m4/Makefile 23 Mar 2010 20:11:52 -0000      1.28
+++ regress/usr.bin/m4/Makefile 10 Mar 2011 14:22:45 -0000
@@ -12,7 +12,7 @@ REGRESS_TARGETS= test-ff_after_dnl test-
     test-weird test-args test-args2 test-esyscmd test-eval test-gnupatterns \
     test-gnupatterns2 test-comments test-synch1 test-synch1bis \
     test-gnuformat test-includes test-dumpdef test-gnuprefix \
-    test-translit
+    test-translit test-gnutranslit
 
 test-ff_after_dnl: ff_after_dnl.m4
        ${M4} ff_after_dnl.m4 | diff - ${.CURDIR}/ff_after_dnl.out
@@ -102,6 +102,9 @@ test-dumpdef:
 test-gnuprefix:
        ${M4} -P ${.CURDIR}/gnuprefix.m4 2>&1 | \
                diff -u - ${.CURDIR}/gnuprefix.out
+
+test-gnutranslit:
+       ${M4} -g ${.CURDIR}/gnutranslit.m4 | diff -u - ${.CURDIR}/gnutranslit.out
 
 .PHONY:        ${REGRESS_TARGETS}
 
Index: regress/usr.bin/m4/gnutranslit.m4
===================================================================
RCS file: regress/usr.bin/m4/gnutranslit.m4
diff -N regress/usr.bin/m4/gnutranslit.m4
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ regress/usr.bin/m4/gnutranslit.m4   10 Mar 2011 20:31:20 -0000
@@ -0,0 +1,13 @@
+translit(`[HAVE_abc/def.h
+]', `
+/.', `/  ')
+translit(`[HAVE_abc/def.h=]', `=/.', `/~~')
+translit(`0123456789', `0123456789', `ABCDEFGHIJ')
+translit(`0123456789', `[0-9]', `[A-J]')
+translit(`abc-0980-zyx', `abcdefghijklmnopqrstuvwxyz', `ABCDEFGHIJKLMNOPQRSTUVWXYZ') 
+translit(`abc-0980-zyx', `[a-z]', `[A-Z]') 
+translit(`GNUs not Unix', `A-Z')
+translit(`GNUs not Unix', `a-z', `A-Z')
+translit(`GNUs not Unix', `A-Z', `z-a')
+translit(`+,-12345', `+--1-5', `<;>a-c-a')
+translit(`abcdef', `aabdef', `bcged')
Index: regress/usr.bin/m4/gnutranslit.out
===================================================================
RCS file: regress/usr.bin/m4/gnutranslit.out
diff -N regress/usr.bin/m4/gnutranslit.out
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ regress/usr.bin/m4/gnutranslit.out  10 Mar 2011 20:31:28 -0000
@@ -0,0 +1,11 @@
+[HAVE_abc def h/]
+[HAVE_abc~def~h/]
+ABCDEFGHIJ
+ABCDEFGHIJ
+ABC-0980-ZYX 
+ABC-0980-ZYX 
+s not nix
+GNUS NOT UNIX
+tmfs not fnix
+<;>abcba
+bgced
