Hi,

Another %n fix.

There's actually a change in behavior, which I can't explain:

BEFORE:
$ ExplicateUTF8 CanterburyPieces.txt
The sequence 0xEF     0xBB     0xBF
             11101111 10111011 10111111
is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
The first byte tells us that there should be 2
continuation bytes since it begins with 3 contiguous 1s.
There are 2 following bytes and all are valid
continuation bytes since they all have high bits 10.
The first byte contributes its low 4 bits.
The remaining bytes each contribute their low 6 bits,
for a total of 16 bits: 1111 111011 111111
Abort trap

AFTER:
$ ExplicateUTF8 CanterburyPieces.txt
The sequence 0xEF     0xBB     0xBF
             11101111 10111011 10111111
is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
The first byte tells us that there should be 2
continuation bytes since it begins with 3 contiguous 1s.
There are 2 following bytes and all are valid
continuation bytes since they all have high bits 10.
The first byte contributes its low 4 bits.
The remaining bytes each contribute their low 6 bits,
for a total of 16 bits: 1111 111011 111111
This is padded to 32 places with 16 zeros: 0000000000000000000000000000000000000000000000001111111011111111
                                            0   0   0   0   F   E   F   F


If anyone wants to try it, my test file is here:
https://codevoid.de/0/p/CanterburyPieces.txt

OK?

Best regards,
Stefan

Index: misc/uniutils/Makefile
===================================================================
RCS file: /cvs/ports/misc/uniutils/Makefile,v
retrieving revision 1.9
diff -u -p -u -p -r1.9 Makefile
--- misc/uniutils/Makefile      28 Jun 2021 21:34:19 -0000      1.9
+++ misc/uniutils/Makefile      10 Sep 2021 21:09:48 -0000
@@ -3,7 +3,7 @@
 COMMENT=       Unicode utilities
 
 DISTNAME=      uniutils-2.27
-REVISION=      3
+REVISION=      4
 CATEGORIES=    misc
 
 HOMEPAGE=      http://billposer.org/Software/unidesc.html
Index: misc/uniutils/patches/patch-ExplicateUTF8_c
===================================================================
RCS file: misc/uniutils/patches/patch-ExplicateUTF8_c
diff -N misc/uniutils/patches/patch-ExplicateUTF8_c
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ misc/uniutils/patches/patch-ExplicateUTF8_c 10 Sep 2021 21:09:48 -0000
@@ -0,0 +1,17 @@
+$OpenBSD$
+
+Remove %n format specifier
+
+Index: ExplicateUTF8.c
+--- ExplicateUTF8.c.orig
++++ ExplicateUTF8.c
+@@ -214,7 +214,8 @@ main(int ac, char **av){
+     printf("%s ",tempstr); 
+   }
+   printf("\n");
+-  printf("This is padded to 32 places with %d zeros: %n%s\n",(32-GotBits),&spaces,binfmtl(ch));
++  spaces = printf("This is padded to 32 places with %d zeros: %s\n",(32-GotBits),binfmtl(ch));
++  spaces -= strlen(binfmtl(ch));
+   sprintf(tempstr,"                                ");
+   sprintf(tempstr,"%08lX",ch);
+   tempstr[28] = tempstr[7];