Hi,

Another %n fix.

There's actually a change in behavior, which I can't explain:

BEFORE:

$ ExplicateUTF8 CanterburyPieces.txt
The sequence 0xEF 0xBB 0xBF
11101111 10111011 10111111
is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
The first byte tells us that there should be 2 continuation bytes since it begins with 3 contiguous 1s.
There are 2 following bytes and all are valid continuation bytes since they all have high bits 10.
The first byte contributes its low 4 bits.
The remaining bytes each contribute their low 6 bits, for a total of 16 bits: 1111 111011 111111
Abort trap

AFTER:

$ ExplicateUTF8 CanterburyPieces.txt
The sequence 0xEF 0xBB 0xBF
11101111 10111011 10111111
is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
The first byte tells us that there should be 2 continuation bytes since it begins with 3 contiguous 1s.
There are 2 following bytes and all are valid continuation bytes since they all have high bits 10.
The first byte contributes its low 4 bits.
The remaining bytes each contribute their low 6 bits, for a total of 16 bits: 1111 111011 111111
This is padded to 32 places with 16 zeros:
0000000000000000000000000000000000000000000000001111111011111111
0 0 0 0 F E F F

If anyone wants to try it, my test file is here:
https://codevoid.de/0/p/CanterburyPieces.txt

OK?

Best regards,
Stefan

Index: misc/uniutils/Makefile
===================================================================
RCS file: /cvs/ports/misc/uniutils/Makefile,v
retrieving revision 1.9
diff -u -p -u -p -r1.9 Makefile
--- misc/uniutils/Makefile 28 Jun 2021 21:34:19 -0000 1.9
+++ misc/uniutils/Makefile 10 Sep 2021 21:09:48 -0000
@@ -3,7 +3,7 @@
 COMMENT= Unicode utilities
 DISTNAME= uniutils-2.27
-REVISION= 3
+REVISION= 4
 CATEGORIES= misc
 HOMEPAGE= http://billposer.org/Software/unidesc.html
Index: misc/uniutils/patches/patch-ExplicateUTF8_c
===================================================================
RCS file: misc/uniutils/patches/patch-ExplicateUTF8_c
diff -N misc/uniutils/patches/patch-ExplicateUTF8_c
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ misc/uniutils/patches/patch-ExplicateUTF8_c 10 Sep 2021 21:09:48 -0000
@@ -0,0 +1,17 @@
+$OpenBSD$
+
+Remove %n format specifier
+
+Index: ExplicateUTF8.c
+--- ExplicateUTF8.c.orig
++++ ExplicateUTF8.c
+@@ -214,7 +214,8 @@ main(int ac, char **av){
+ printf("%s ",tempstr);
+ }
+ printf("\n");
+- printf("This is padded to 32 places with %d zeros: %n%s\n",(32-GotBits),&spaces,binfmtl(ch));
++ spaces = printf("This is padded to 32 places with %d zeros: %s\n",(32-GotBits),binfmtl(ch));
++ spaces -= strlen(binfmtl(ch));
+ sprintf(tempstr," ");
+ sprintf(tempstr,"%08lX",ch);
+ tempstr[28] = tempstr[7];