Steve Long wrote:
> No, that's about the size of it-- if you'd like to tie it to ASCII,
> irrespective of locale, that's fair enough.
>
I had a feeling this statement was incorrect (what zlin said about a-z), so I
asked dalias in #bash, who is knowledgeable about locale stuff:

<igli> dalias; have a q re: l10n. if we want to restrict allowed chars to ASCII (for config) without affecting user's LC_CTYPE, what's the best way?
<dalias> igli, where do you want to restrict chars?
<igli> example i've seen is in checking a var
<dalias> checking a var to make sure it's all ascii?
<igli> yeah
<igli> [A-Za-z] isn't locale-safe is it?
<dalias> match it to a regex
<igli> hmm ok
<dalias> it's safe if LC_COLLATE=C
<igli> ah thanks :D
<dalias> alternatively
<dalias> the most safe way is just to write all chars explicitly
<dalias> [ABCDEF...xyz]
<igli> hehe
<igli> ok nice one, i'll pass it on.
<dalias> anyway posix guarantees that the collation order in C locale is ascii order
<dalias> even if the host's character encoding is something idiotic like ebcdic

The odd bit (for me) is that [A-Za-z] is affected by LC_COLLATE, not LC_CTYPE,
which makes sense if you think of a range as a comparison against its two
endpoints. [[:class:]] is of course governed by LC_CTYPE.

So setting LC_COLLATE=C would appear to make sense, as long as you are never
dealing with user filenames, but only the portage tree (which seems a bit
yuck). A better general implementation might be a function that checks
against the explicit list, since that would have no implications for file
handling:

isAlphaASCII() {
    local i
    # Fail if any argument contains a character outside the explicit list.
    for i; do
        [[ $i = *[^ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz]* ]] \
            && return 1
    done
    return 0
}

I /guess/ in sh the test line would be:

    case "$i" in *[!ABC..xyz]* ) return 1;; esac
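
For anyone wanting to try the sh form, here is a minimal sketch of how that case test might sit inside a complete function. It is untested against portage itself; the name is_alpha_ascii and the loop variable arg are made up for illustration, and since POSIX sh has no local, arg leaks into the caller's scope. Like the bash version, it accepts an empty argument.

```sh
#!/bin/sh
# Sketch only: is_alpha_ascii is a hypothetical name, not an existing helper.
# Returns 0 if every argument consists solely of ASCII letters, 1 otherwise.
is_alpha_ascii() {
    for arg in "$@"; do
        case $arg in
            # Explicit list instead of a range, so LC_COLLATE plays no part.
            *[!ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz]*)
                return 1 ;;
        esac
    done
    return 0
}
```

Called as `is_alpha_ascii foo Bar` it should return 0, and as `is_alpha_ascii foo-1` it should return 1, whatever the caller's locale settings are.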