On 02/10/15 at 13:01, Paul Eggert wrote:
> On 10/02/2015 02:43 AM, Santiago Ruano Rincón wrote:
> >grep doesn't match characters with diacritical
> >marks in ISO-8859 files, inside a Unicode environment
> 
> That is normal and expected behavior.  In a UTF-8 locale, "á" is represented
> by the two bytes 0xC3 and 0xA1.  In an ISO-8859 file, the same character is
> represented by the single byte 0xE1.  The UTF-8 pattern won't match the
> ISO-8859 representation.
> 
> To avoid this problem, switch to an ISO-8859 locale before using grep to
> read ISO-8859 text files.  This is true for pretty much any standard
> utility, not just grep.  Alternatively, you can translate the text files
> from ISO-8859 to UTF-8, before giving the resulting text to grep or to other
> utilities.
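
For reference, the byte-level mismatch described above can be checked with a
short Python sketch. It is only an illustration, not how grep works
internally; the helper names contains() and to_utf8() are hypothetical and
exist only for this example.

```python
# Illustrative sketch: compare the two encodings of "á" mentioned above.
text = "\u00e1"  # the character "á"

utf8_bytes = text.encode("utf-8")         # two bytes: 0xC3 0xA1
latin1_bytes = text.encode("iso-8859-1")  # one byte: 0xE1


def contains(haystack: bytes, needle: bytes) -> bool:
    """Hypothetical helper: plain byte-wise substring search."""
    return needle in haystack


def to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Hypothetical helper: re-encode bytes from source_encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# A UTF-8 pattern does not occur in the ISO-8859-1 bytes...
print(contains(latin1_bytes, utf8_bytes))
# ...but it does once the data has been converted to UTF-8.
print(contains(to_utf8(latin1_bytes), utf8_bytes))
```

Converting the file first, as suggested above, is what makes the second
search succeed.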

The latest changes also fix this:

With 2.24:

% printf 'á' | LC_ALL=C grep .
Binary file (standard input) matches

With the 2.24.13-bed6 pre-release:

% printf 'á' | LC_ALL=C grep á
á

Thanks,

Santiago
