On 10/08/2014 11:53 AM, Ángel González wrote:
> Eric Blake wrote:
>> On 10/08/2014 08:52 AM, Stephane Chazelas wrote:
>>> When bash parses code it honours the "blank" character class in
>>> the current locale as token separator.
>>>
>>> For instance, if "x" is a blank character in the current locale,
>>
>> Such a locale is invalid per POSIX; but the invalidity of the locale
>> doesn't stop it from being a potential attack vector :)
>
>
> Is it? I looked at locale definition [1] but it only seems to define
> what the POSIX/C locale must be, not any restriction on what a locale
> could impose. It seems to me that a Klingon locale where everything
> outside U+F8D0 - U+F8FF [2] were considered a blank would be conformant
> (although an Earth application using such locale would hit a lot of
> undefined cases ☺).
>
> 1- http://pubs.opengroup.org/onlinepubs/7908799/xbd/locale.html
> 2- http://www.evertype.com/standards/csur/klingon.html

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html

POSIX requires that ALL locales support the portable filename set of
characters as single-byte seven-bit characters (many of those characters
can ALSO occur as a byte within a multibyte character, as in Big5
encoding; at least UTF-8 is a sane encoding where no single-byte
character occurs embedded in a multibyte character). For some
characters ('.', '/', '\n', '\r') the encoding MUST be invariant across
all supported locales. It has a little bit of fuzz room by stating that
if two locales choose a different encoding for other characters, then
results are unspecified when crossing those locale boundaries; but in
all reality, the ONLY widely-used encoding that does not have the same
bytes as ASCII is EBCDIC, and it already picks different values for
encoding '.' and '/', so it is impossible to have a POSIX-compliant
system that simultaneously supports ASCII and EBCDIC locales.

At any rate, I read the requirements on the portable filename set as
requiring that ALL locales define 'x' as a 7-bit character, and I'm not
seeing enough flexibility in that to define a locale that puts 'x' in
the blank class.

On the other hand, a locale that abuses 8-bit characters to be blanks in
some locales and letters in others is indeed quite possible and
compliant; so while using 'x' as a blank is questionable, the whole idea
of abusing locales to cause parse differences is not.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
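
To illustrate the point above about Big5 versus UTF-8, here is a rough
Python sketch that looks for ASCII-range bytes inside multibyte
characters. The helper name embedded_ascii_bytes and the choice of
scanning the CJK block U+4E00..U+9FFF are my own assumptions for the
demonstration; it relies on Python's bundled "big5" codec rather than on
any system locale, so it says nothing about what a particular libc
locale does.

```python
import string

# Portable filename character set per POSIX: A-Z a-z 0-9 . _ -
PORTABLE_FILENAME = set(
    (string.ascii_letters + string.digits + "._-").encode("ascii")
)


def embedded_ascii_bytes(text, encoding):
    """Return ASCII-range byte values that occur inside multibyte characters.

    Hypothetical helper for illustration only: each character of `text`
    is encoded on its own, and only encodings longer than one byte are
    inspected.
    """
    found = set()
    for ch in text:
        try:
            encoded = ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # character is not representable in this encoding
        if len(encoded) > 1:
            found.update(b for b in encoded if b < 0x80)
    return found


# Sample: the CJK Unified Ideographs block (an arbitrary choice).
cjk = "".join(chr(cp) for cp in range(0x4E00, 0xA000))

# Bytes from the portable filename set that appear inside Big5 characters.
big5_hits = embedded_ascii_bytes(cjk, "big5") & PORTABLE_FILENAME

# UTF-8 lead and continuation bytes all have the high bit set.
utf8_hits = embedded_ascii_bytes(cjk, "utf-8")

print("Big5:", "".join(sorted(chr(b) for b in big5_hits)))
print("UTF-8:", utf8_hits)
```

The Big5 line should list at least some portable filename characters
that double as trail bytes, while the UTF-8 set stays empty. A byte-wise
parser that does not track multibyte state can therefore be misled by
Big5 input in a way that it cannot be by UTF-8.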