Re: Grepping Unicode files?

2015-05-14 Thread Csaba Raduly
Hi Vince,

On Thu, May 14, 2015 at 7:14 PM, Vince Rice wrote:
> Oh my, the rabbit-hole gets deeper. I don't know the difference between wide
> character and multi-byte.

(snip)

Maybe this will help: http://www.joelonsoftware.com/articles/Unicode.html

Csaba

Re: Grepping Unicode files?

2015-05-14 Thread Eric Blake
to use surrogate pairs for characters over u+. On Linux, glibc sets wchar_t to 4 bytes, and prefers wide operations in UCS-4.

>> grep cannot handle UTF16 natively. iconv exists to do encoding
>> transformations, so that the rest of the system can live in multi-byte
>> world inst
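To make the surrogate-pair and 4-byte points concrete, here is a small Python sketch. It is an illustration only, not part of the original message; the sample character is an arbitrary code point outside the Basic Multilingual Plane chosen for this example.

```python
# Illustration only: how one code point outside the Basic Multilingual Plane
# is laid out in UTF-16, UTF-32 (what a 4-byte wchar_t would hold) and UTF-8.
ch = "\U0001F600"  # arbitrary sample character, chosen for this sketch

utf16 = ch.encode("utf-16-le")
utf32 = ch.encode("utf-32-le")
utf8 = ch.encode("utf-8")

# Split the UTF-16 bytes into 16-bit code units. A character like this one
# needs two units: a high surrogate followed by a low surrogate.
units = [int.from_bytes(utf16[i:i + 2], "little") for i in range(0, len(utf16), 2)]

print([hex(u) for u in units])
print(len(utf16), len(utf32), len(utf8))
```

The two 16-bit units fall in the surrogate ranges (0xD800 to 0xDBFF and 0xDC00 to 0xDFFF), whereas the UTF-32 form is a single 4-byte value.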

Re: Grepping Unicode files?

2015-05-14 Thread Vince Rice
> iconv exists to do encoding
> transformations, so that the rest of the system can live in multi-byte
> world instead of worrying about wide-character encodings.

… grep can’t handle unicode files. Good to know. iconv it is. Thanks again!
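The "convert first, then search" approach can be sketched in Python as well. This is a hedged illustration rather than what anyone in the thread ran: the function name `grep_utf16` and the sample text are made up for this example, and it assumes the input begins with a byte order mark so the `utf-16` codec can pick the byte order.

```python
import re

def grep_utf16(data: bytes, pattern: str) -> list[str]:
    """Decode UTF-16 bytes to text, then return the lines matching pattern.

    Hypothetical helper for illustration. Assumes the data starts with a
    byte order mark, which the "utf-16" codec reads and strips.
    """
    text = data.decode("utf-16")
    rx = re.compile(pattern)
    return [line for line in text.splitlines() if rx.search(line)]

# Made-up sample standing in for a UTF-16 log file with Windows line endings.
sample = "Backup Status\r\nOperation: Backup\r\n".encode("utf-16")
print(grep_utf16(sample, "Operation"))
```

For a real file, the bytes would come from opening it in binary mode. The design mirrors the advice quoted above: do the encoding conversion once at the boundary, then match on ordinary text.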

Re: Grepping Unicode files?

2015-05-14 Thread Eric Blake
On 05/14/2015 10:32 AM, Vince Rice wrote:
> locale run from a cmd.exe session says that everything is “C.UTF-8”, while
> locale run from mintty says that everything is en_US.UTF-8. A “which” in both
> cases shows that the locale being run is cygwin’s, so I assume mintty does
> something slightl

Re: Grepping Unicode files?

2015-05-14 Thread Vince Rice
On May 14, 2015, at 10:56 AM, Andrey Repin wrote:
>
> Greetings, Vince Rice!
>
>> uname says "CYGWIN_NT-6.1 machinename 1.7.35(0.287/5/3) 2015-03-04 12:07
>> i686 Cygwin”.
>> I’m running grep 2.21.2, which cygcheck -c says is OK.
>
>> Does Cygwin’s gr

RE: Grepping Unicode files?

2015-05-14 Thread Nellis, Kenneth
> Does Cygwin’s grep support Unicode files? The output from a SQL Server SQL
> Agent job is a Unicode file, i.e. if you look at it in a hex editor every
> other character is 00 because each character is taking up two bytes. The
> filename itself is fine, it’s the contents that is Unic

Re: Grepping Unicode files?

2015-05-14 Thread Václav Haisman
On 14.5.2015 17:42, Vince Rice wrote:
> uname says "CYGWIN_NT-6.1 machinename 1.7.35(0.287/5/3) 2015-03-04
> 12:07 i686 Cygwin”. I’m running grep 2.21.2, which cygcheck -c says
> is OK.
>
> Does Cygwin’s grep support Unicode files? The output from a SQL
> Server SQL Agen

Grepping Unicode files?

2015-05-14 Thread Vince Rice
uname says "CYGWIN_NT-6.1 machinename 1.7.35(0.287/5/3) 2015-03-04 12:07 i686 Cygwin”. I’m running grep 2.21.2, which cygcheck -c says is OK. Does Cygwin’s grep support Unicode files? The output from a SQL Server SQL Agent job is a Unicode file, i.e. if you look at it in a hex editor

Re: Unicode files

2003-01-12 Thread Jon LaBadie
At 21:18 2003-01-12, Clancy Malcolm wrote:
>Can cygwin programs like grep process a unicode file?
>
>I have a Windows 2000 backup log file which seems to be a unicode file.
>When I cat the file under cygwin it displays with spaces between every
>second character: e.g.
>
>ÿ_B a c k u p S t a t u s

Re: Unicode files

2003-01-12 Thread Randall R Schulz
Clancy, Perl has some Unicode modules and Vim (both Cygwin and stand-alone) will edit Unicode files. Most (maybe all?) Gnu text tools are ASCII only. Apropos turns up the Perl modules plus something called "luit." Check it out. Perhaps it might be useful to you. Randall Schulz

Unicode files

2003-01-12 Thread Clancy Malcolm
Can cygwin programs like grep process a unicode file? I have a Windows 2000 backup log file which seems to be a unicode file. When I cat the file under cygwin it displays with spaces between every second character: e.g. ÿ_B a c k u p S t a t u s O p e r a t i o n : B a c k u p A c t i v e
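The leading "ÿ" in that output is consistent with a UTF-16 little-endian byte order mark (bytes FF FE) being shown as if it were single-byte text, and the apparent spaces with the zero high bytes of ASCII-range characters. A minimal Python sketch of checking for that mark follows. The function name `sniff_utf16` and the sample bytes are made up for illustration, and this is an assumption about the file, not something the message confirms.

```python
import codecs

def sniff_utf16(data: bytes):
    """Return a codec name if data starts with a UTF-16 byte order mark.

    Hypothetical helper for illustration. It does not try to tell a
    UTF-16 LE mark apart from a UTF-32 LE mark, which begins with the
    same two bytes.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # b"\xff\xfe"
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # b"\xfe\xff"
        return "utf-16-be"
    return None

# Made-up sample: byte order mark followed by "Backup" in UTF-16 LE.
sample = codecs.BOM_UTF16_LE + "Backup".encode("utf-16-le")

print(sniff_utf16(sample))
# Skip the 2-byte mark, then decode with the detected codec.
print(sample[2:].decode(sniff_utf16(sample)))
```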