On 2025-08-31 13:06, Mariusz Wodzicki via Cygwin wrote:
Description of the problem.
[0-9] also matches certain Unicode superscript characters (namely ⁰ ⁴ ⁵ ⁶
⁷ ⁸ ⁹) and every Unicode subscript character.

Example: the directory has the following files:
$ /bin/ls
₀.txt  ₁.txt  ₂.txt  ₃.txt  ₄.txt  ₅.txt  ₆.txt  ₇.txt  ₈.txt  ₉.txt
⁰.txt  ¹.txt  ².txt  ³.txt  ⁴.txt  ⁵.txt  ⁶.txt  ⁷.txt  ⁸.txt  ⁹.txt

$ /bin/ls [0-9].txt
₀.txt  ₁.txt  ₃.txt  ⁴.txt  ⁵.txt  ⁶.txt  ⁷.txt  ⁸.txt
⁰.txt  ₂.txt  ₄.txt  ₅.txt  ₆.txt  ₇.txt  ₈.txt

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=

System.
Fully up to date Windows 11
cygwin 3.6.4-1
bash    5.2.21-1

For reproducible results, prefix commands with LC_ALL=C … (or possibly just LC_COLLATE=C or LC_CTYPE=C, or =POSIX instead of =C) to standardize the locale. Otherwise many commands will respect the current locale, and some treat certain Unicode characters specially regardless of locale, e.g. `info wc`:

"Unless the environment variable ‘POSIXLY_CORRECT’ is set, GNU ‘wc’ treats the following Unicode characters as white space even if the current locale does not: U+00A0 NO-BREAK SPACE, U+2007 FIGURE SPACE, U+202F NARROW NO-BREAK SPACE, and U+2060 WORD JOINER."
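
As a rough illustration of the locale advice for the glob in the report, here is a minimal sketch. The scratch directory and the three file names are made up for the demo. It assumes the shell itself expands [0-9].txt before ls ever runs, so a per-command prefix may not change which names the pattern matches; the sketch therefore sets LC_ALL in the (sub)shell that performs the expansion.

```shell
# Hypothetical demo: scratch directory and file names are illustrative only.
demo_dir=$(mktemp -d)
touch "$demo_dir/1.txt" "$demo_dir/₁.txt" "$demo_dir/¹.txt"

# The pattern is expanded by the shell, so the locale is set in the shell
# doing the expansion. The command substitution runs in a subshell, which
# keeps the locale change from leaking into the calling shell.
c_matches=$(cd "$demo_dir" && LC_ALL=C && export LC_ALL && printf '%s\n' [0-9].txt)

# In the C locale only the ASCII digit name should match: 1.txt
printf '%s\n' "$c_matches"

rm -rf "$demo_dir"
```

Running `export LC_ALL=C` once at the top of a script should have the same effect for every later glob in that script.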

For GNU utilities where info pages are the preferred documentation, such as coreutils*, compilers and language processors, and tools packages, many details do not appear in the man pages, which instead point to the full documentation, for example:

"Full documentation <https://www.gnu.org/software/coreutils/wc> or available locally via: info '(coreutils) wc invocation'"

although `info wc` shows the same page.

—————
* [ arch b2sum base32 base64 basename cat chcon chgrp chmod chown chroot cksum comm cp csplit cut date dd df dir dircolors dirname du echo env expand expr factor false fmt fold gkill groups head hostid id install join link ln logname ls md5sum mkdir mkfifo mknod mktemp mv nice nl nohup nproc numfmt od paste pathchk pinky pr printenv printf ptx pwd readlink realpath rm rmdir runcon seq sha1sum sha224sum sha256sum sha384sum sha512sum shred shuf sleep sort split stat stdbuf stty sum sync tac tail tee test timeout touch tr true truncate tsort tty uname unexpand uniq unlink users vdir wc who whoami yes

--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retrancher  but when there is no more to cut
                                -- Antoine de Saint-Exupéry
