Bug#500501: More detailed analysis

2009-11-19 Paolo Bonzini
From http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html
"A period ( '.' ), when used outside a bracket expression, is a BRE that shall match any character in the supported character set except NUL."
My point here is that the current implementation of regexes makes '.' NOT mat
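The distinction under discussion can be illustrated with a small sketch. It uses Python's re module, not libc's or sed's regex code, and it assumes the disputed input is a byte that is not valid UTF-8 (the snippets in this thread suggest this but are truncated). The variable names are hypothetical. In a byte-oriented mode, '.' matches such a byte. In a character-oriented view, the byte does not decode to any character, so there is nothing in the "supported character set" for '.' to match.

```python
import re

# 0xFF never occurs in valid UTF-8, so this input is "binary" from a
# UTF-8 point of view. (Illustrative data, not taken from the bug report.)
data = b"a\xffb"

# Byte-oriented matching: with a bytes pattern, '.' matches any single
# byte except a newline, so the stray 0xFF byte is matched.
byte_match = re.fullmatch(rb"a.b", data)

# Character-oriented view: strict UTF-8 decoding rejects the input, so
# the middle byte does not correspond to any character.
try:
    data.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

print("byte-mode match:", byte_match is not None)
print("valid UTF-8:", decodes_as_utf8)
```

This only shows why byte-oriented and multibyte-aware matching can disagree on the same input. It says nothing about how libc's regex code is actually structured.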

Bug#500501: More detailed analysis

2009-11-19 Dmitri Gribenko
On Thu, Nov 19, 2009 at 5:18 PM, Paolo Bonzini wrote:
> --binary is strictly for Windows support.  There is already one such
> mode, it's called LANG=C.
I didn't think about that. Thanks.
> From http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html
>
> "A period ( '.' ), when u

Bug#500501: More detailed analysis

2009-11-19 Paolo Bonzini
> In my opinion, the proper solution for sed would be:
> 1. --binary option should throw sed in a true binary mode without any
> knowledge of UTF-8 or any other multibyte encodings.  This would allow
> to process binary files without any UTF-8 logic.  And this would allow
> direct manipulation of i

Bug#500501: More detailed analysis

2009-11-19 Dmitri Gribenko
Hi,
Please see the more detailed analysis in bug #555922, which I filed against libc (sed's regex implementation is based on, or is a copy of, libc's, and this bug affects many more packages).
In my opinion, the proper solution for sed would be:
1. --binary option should throw sed in