In a hexdump, printf seems to add 3 bytes for the thousands separator:

#!/bin/sh
for l in de_DE en_US nb_NO nn_NO ; do
   echo "LC_NUMERIC=$l.UTF-8"
   for n in 1 100 1000 10000 100000 1000000 10000000 ; do
      LC_NUMERIC=$l.UTF-8 /usr/bin/printf "<%'8d>" $n | hexdump -C
   done
done

Output:

LC_NUMERIC=nb_NO.UTF-8
00000000  3c 20 20 20 20 20 20 20  31 3e                    |<       1>|
0000000a
00000000  3c 20 20 20 20 20 31 30  30 3e                    |<     100>|
0000000a
00000000  3c 20 31 e2 80 af 30 30  30 3e                    |< 1...000>|
0000000a
00000000  3c 31 30 e2 80 af 30 30  30 3e                    |<10...000>|
0000000a
00000000  3c 31 30 30 e2 80 af 30  30 30 3e                 |<100...000>|
0000000b
00000000  3c 31 e2 80 af 30 30 30  e2 80 af 30 30 30 3e     |<1...000...000>|
0000000f
00000000  3c 31 30 e2 80 af 30 30  30 e2 80 af 30 30 30 3e  |<10...000...000>|
00000010
LC_NUMERIC=nn_NO.UTF-8
00000000  3c 20 20 20 20 20 20 20  31 3e                    |<       1>|
0000000a
00000000  3c 20 20 20 20 20 31 30  30 3e                    |<     100>|
0000000a
00000000  3c 20 31 e2 80 af 30 30  30 3e                    |< 1...000>|
0000000a
00000000  3c 31 30 e2 80 af 30 30  30 3e                    |<10...000>|
0000000a
00000000  3c 31 30 30 e2 80 af 30  30 30 3e                 |<100...000>|
0000000b
00000000  3c 31 e2 80 af 30 30 30  e2 80 af 30 30 30 3e     |<1...000...000>|
0000000f
00000000  3c 31 30 e2 80 af 30 30  30 e2 80 af 30 30 30 3e  |<10...000...000>|
00000010

However, the issue occurs in both Konsole and XTerm. So, the bytes
"0xe2 0x80 0xaf" inserted by printf for the thousands separator seem to
be incorrect?

"0xe2 0x80 0xaf" is the UTF-8 encoding of U+202F NARROW NO-BREAK SPACE ->
https://www.fileformat.info/info/unicode/char/202f/index.htm .

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to coreutils in Ubuntu.
https://bugs.launchpad.net/bugs/2058775

Title:
  coreutils: printf formatting bug for nb_NO and nn_NO locales

Status in coreutils package in Ubuntu:
  New

Bug description:
  I just discovered a printf bug for at least the nb_NO and nn_NO
  locales when printing numbers with thousands separator.

  To reproduce:

  #!/bin/bash
  for l in de_DE en_US nb_NO ; do
     echo "LC_NUMERIC=$l.UTF-8"
     for n in 1 100 1000 10000 100000 1000000 10000000 ; do
        LC_NUMERIC=$l.UTF-8 /usr/bin/printf "<%'10d>\n" $n
     done
  done

  The expected output of "%'10d" is a right-aligned number string that
  is 10 characters wide.

  The output of the test script is fine for e.g. LC_NUMERIC=de_DE.UTF-8
  and LC_NUMERIC=en_US.UTF-8:

  LC_NUMERIC=de_DE.UTF-8
  <         1>
  <       100>
  <     1.000>
  <    10.000>
  <   100.000>
  < 1.000.000>
  <10.000.000>
  LC_NUMERIC=en_US.UTF-8
  <         1>
  <       100>
  <     1,000>
  <    10,000>
  <   100,000>
  < 1,000,000>
  <10,000,000>

  However, for LC_NUMERIC=nb_NO.UTF-8 and LC_NUMERIC=nn_NO.UTF-8, the
  formatting is wrong:

  LC_NUMERIC=nb_NO.UTF-8
  <         1>
  <       100>
  <   1 000>
  <  10 000>
  < 100 000>
  <1 000 000>
  <10 000 000>
  LC_NUMERIC=nn_NO.UTF-8
  <         1>
  <       100>
  <   1 000>
  <  10 000>
  < 100 000>
  <1 000 000>
  <10 000 000>

  I reproduced the issue with coreutils-8.32-4.1ubuntu1.1 (Ubuntu 22.04)
  as well as coreutils-9.3-5.fc39.x86_64 (Fedora 39).

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: coreutils 8.32-4.1ubuntu1.1
  ProcVersionSignature: Ubuntu 6.5.0-26.26~22.04.1-generic 6.5.13
  Uname: Linux 6.5.0-26-generic x86_64
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: KDE
  Date: Fri Mar 22 21:33:13 2024
  InstallationDate: Installed on 2022-11-29 (479 days ago)
  InstallationMedia: Kubuntu 22.04.1 LTS "Jammy Jellyfish" - Release amd64 (20220809.1)
  SourcePackage: coreutils
  UpgradeStatus: No upgrade log present (probably fresh install)
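
The hexdump for "%'8d" of 1000 shows one padding space followed by 7
bytes (1 + 3 + 3), which is consistent with the field width being
counted in bytes rather than in characters. As a possible workaround
sketch, the padding could be done by hand with a character count. The
helper name pad_left and its pl_* variables are hypothetical and not
part of coreutils, and the sketch assumes UTF-8 text and counts
characters by dropping UTF-8 continuation bytes (0x80-0xBF):

```shell
#!/bin/sh
# pad_left WIDTH TEXT
# Hypothetical helper (not part of coreutils): right-align TEXT in a field
# of WIDTH characters, counting UTF-8 characters instead of bytes.
pad_left() {
    pl_width=$1
    pl_text=$2
    # Every UTF-8 character has exactly one byte that is not a continuation
    # byte (0x80-0xBF), so deleting those bytes leaves one byte per character.
    pl_chars=$(( $(printf '%s' "$pl_text" | LC_ALL=C tr -d '\200-\277' | wc -c) ))
    pl_pad=$((pl_width - pl_chars))
    if [ "$pl_pad" -gt 0 ]; then
        printf "%${pl_pad}s" ''
    fi
    printf '%s' "$pl_text"
}

# U+202F NARROW NO-BREAK SPACE as raw UTF-8 bytes (e2 80 af), so the demo
# does not depend on the nb_NO locale being installed.
sep=$(printf '\342\200\257')
for num in "1" "100" "1${sep}000" "1${sep}000${sep}000"; do
    printf '<%s>\n' "$(pad_left 10 "$num")"
done
```

With the affected locales, the number itself could be formatted first
without a width, e.g. num=$(LC_NUMERIC=nb_NO.UTF-8 /usr/bin/printf "%'d" 1000),
and then passed to pad_left 10 "$num". This only works around the
alignment in scripts; it does not change the behaviour of printf itself.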