Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnux32
Compiler: gcc-10.1.0 -mx32
Compilation CFLAGS: -O2 -Wno-parentheses -Wno-format-security
uname output: Linux loucetios 5.7.9 #1 SMP @1590968955 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnux32
Bash Version: 5.0
Patch Level: 18
Release Status: release
Description:
In lib/glob/smatch.c, two functions are used to check
equivalence classes in patterns: collequiv and collequiv_wc.
The former is used if the pattern does not contain any
multi-byte characters, the latter otherwise (with exceptions
that are not relevant to this bug). The two functions
do not give the same results: collequiv does not implement the
fnmatch() fallback code that collequiv_wc does implement,
leading to inconsistent matching for ASCII-only equivalence
classes.
(This is not something I encountered in a real script. I am
implementing equivalence class support myself, using fnmatch()
as the main check rather than as a fallback, and comparing the
results to those of other shells.)
Repeat-By:
case a in [[=A=]]) echo match 1 ;; esac
case aá in [[=A=]]á) echo match 2 ;; esac
In locales where A and a are not in the same equivalence class,
this should print nothing. glibc's ja_JP.UTF-8 is such a locale.
The C locale is such a locale as well, but it cannot represent
the á character, so it is a poor choice for testing.
In locales where A and a are in the same equivalence class, this
should print "match 1" and "match 2". glibc's en_US.UTF-8 is
such a locale.
What actually happens in glibc's en_US.UTF-8 locale is that only
"match 2" is printed; "match 1" is missing.
Fix:
Copy the FNMATCH_EQUIV_FALLBACK logic from collequiv_wc to
collequiv. _fnmatch_fallback_wc may be copied to create a non-wc
version of it, but it also works to have collequiv call
_fnmatch_fallback_wc by converting characters to wide
characters.
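
As a rough illustration of the first option, a non-wc fallback
could look something like the sketch below. The name
fnmatch_equiv_fallback is made up for this report and is not taken
from the bash sources. It assumes single-byte characters and simply
asks the C library's fnmatch() whether c matches [[=equiv=]].

```c
#include <fnmatch.h>

/* Hypothetical helper, not part of bash: ask fnmatch() whether the
   single-byte character C is in the equivalence class of EQUIV.
   Returns 0 on a match and FNM_NOMATCH otherwise, like fnmatch(). */
static int
fnmatch_equiv_fallback (int c, int equiv)
{
  char str[2];
  char pat[8];

  /* A NUL byte cannot be represented in the string or the pattern. */
  if (c == 0 || equiv == 0)
    return FNM_NOMATCH;

  /* The string to match is just the one character. */
  str[0] = (char) c;
  str[1] = '\0';

  /* Build the pattern "[[=X=]]" where X is EQUIV. */
  pat[0] = pat[1] = '[';
  pat[2] = '=';
  pat[3] = (char) equiv;
  pat[4] = '=';
  pat[5] = pat[6] = ']';
  pat[7] = '\0';

  return fnmatch (pat, str, 0);
}
```

collequiv could presumably call something like this when its
existing comparison reports no match. The result depends on the
current LC_COLLATE setting, so the sketch assumes the caller has
already set the locale.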