> | [A-Z] isn't safe to use unless ...
>
> That's true to an extent, but we know here that the intent is to
> match 'C' which is between A and Z in every locale in the universe.
> Variations on A might not be, variations on Z might not be, and there
> might be more than just the upper case English letters between A and Z,
> included in the range (even including things which are not letters at
> all, upper case or not, and lower case chars might be included) but we
> can assume that for any real locale, 'C' will be in that range (real as
> being one in use in the world, rather than one invented for the very
> purpose of not including C in the collating sequence between A and Z)

So, embracing and extending your assumptions, we can also claim that
the letter T is between A and Z in every locale in the universe, right?

wooledg:~$ printf %s\\n {A..Z} | LC_COLLATE=et_EE.utf8 sort | tr '\n' ' '
A B C D E F G H I J K L M N O P Q R S Z T U V W X Y

Isn't real life FUN?

But perhaps you're right about the letter C specifically.  Maybe that one
letter just happens to lie between A and Z in every locale on Earth.  I
don't happen to know of any counter-examples... yet.

Now, for the original poster: the meaning of [A-Z] and [a-z] did in fact
change between bash 4 and bash 5.

wooledg:~$ bash-4.4 -c 'LC_COLLATE=et_EE.utf8; [[ T = [A-Z] ]] && echo match'
wooledg:~$ bash-5.0 -c 'LC_COLLATE=et_EE.utf8; [[ T = [A-Z] ]] && echo match'
match

This is yet one more reason you can't rely on [A-Z] or [a-z] to work as
expected in scripts.  Even between different versions of bash, within the
same locale, on the same computer, it doesn't behave consistently.

I strongly recommend switching to [[:upper:]] and friends, unless you
always work in the C locale (and explicitly set it in your scripts).
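
For what it's worth, here is a rough sketch of both options.  The function
names is_upper and is_ascii_upper are made up for illustration; they are
not bash builtins.  Keep in mind that outside the C locale, [[:upper:]]
may also match upper case letters beyond the 26 ASCII ones, which may or
may not be what you want.

```bash
#!/usr/bin/env bash

# Option 1 (hypothetical helper): use a POSIX character class.
# Membership in [[:upper:]] is a property of the character itself,
# not of where it lands in the locale's collating sequence.
is_upper() {
    [[ $1 == [[:upper:]] ]]
}

# Option 2 (hypothetical helper): pin the C locale for the test,
# so that [A-Z] means exactly the 26 ASCII upper case letters.
# The assignment is local, so the caller's locale is untouched.
is_ascii_upper() {
    local LC_ALL=C
    [[ $1 == [A-Z] ]]
}

# Try a few single characters against the first helper.
for ch in C T t 5; do
    if is_upper "$ch"; then
        echo "$ch: upper"
    else
        echo "$ch: not upper"
    fi
done
```

If the whole script is meant to be ASCII-only, putting "export LC_ALL=C"
near the top is the simpler route, and then plain [A-Z] is fine.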