On Wednesday 10 August 2011 06:53:58 Darac Marjal wrote:
> On Tue, Aug 09, 2011 at 01:24:46PM -0700, Mike McClain wrote:
> > On Tue, Aug 09, 2011 at 12:42:18PM -0400, Eike Lantzsch wrote:
> > > Hi:
> > >
> > > For some time I'm looking to find a method to remove unicode control
> > > characters like U+202A; U+202C; U+200F from filenames.
> > > I found lots of examples to do this programmatically with python, perl,
> > > even for VB and Java.
> > > I was looking to do this with bash, find, grep and/or even sed because
> > > I just never wrote code in python or perl.
> > > Can some kind soul please give me a hint how to proceed?
>
> If you've found a recipe in perl, I can recommend /usr/bin/rename (part
> of the perl package and, on my system, a link to /usr/bin/prename). The
> syntax is "rename regex filespec" so you can say "rename 's/foo/bar/'
> bar.jpg". Maybe that'll help.

Thank you for the suggestion, but as far as I can see prename is not
UTF-8-aware. Is that true?

Right now I'm studying
http://en.wikibooks.org/wiki/Perl_Programming/Unicode_UTF-8
and
http://www.perlmonks.org/?node_id=551676
and
http://perldoc.perl.org/perlunicode.html

If I enter

    myuser@mysytem:~/path-name-of-unicode-files$ rename -n 's/\x{202A}//' *

I get no output, although U+202A is definitely the first character in the
filename.

This definitely needs more than a cursory look into perl, which is exactly
what I wanted to avoid.
Maybe I had better post to a perl mailing list. Only I'm afraid that I'll get
nothing but "RTFM!" and "do your own homework!" - and they may be right ...

Cheers
Eike
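
A possible explanation for the empty output, offered as a sketch and not as a confirmed diagnosis: unless told otherwise, perl handles file names as raw bytes, so the pattern \x{202A} looks for one character with code point 0x202A, while a UTF-8 file name contains the three bytes E2 80 AA instead. Matching those bytes directly avoids perl altogether. The POSIX shell sketch below uses only printf and sed; it assumes the file names are UTF-8 encoded, and the names strip_marks and rename_marks are hypothetical helpers invented for this illustration.

```shell
#!/bin/sh
# Sketch only: assumes UTF-8 file names. strip_marks and rename_marks are
# hypothetical helper names, not part of any package.

# UTF-8 byte sequences of the three characters, written as octal escapes.
LRE=$(printf '\342\200\252')   # U+202A LEFT-TO-RIGHT EMBEDDING
PDF=$(printf '\342\200\254')   # U+202C POP DIRECTIONAL FORMATTING
RLM=$(printf '\342\200\217')   # U+200F RIGHT-TO-LEFT MARK

# Print the given name with all three byte sequences removed.
# LC_ALL=C makes sed compare plain bytes regardless of the user's locale.
strip_marks() {
    printf '%s\n' "$1" | LC_ALL=C sed "s/$LRE//g; s/$PDF//g; s/$RLM//g"
}

# Walk the current directory. Without arguments this is a dry run that only
# prints the planned renames; with --apply it calls mv.
rename_marks() {
    for f in ./*; do
        [ -e "$f" ] || continue
        new=./$(strip_marks "${f#./}")
        [ "$f" = "$new" ] && continue
        if [ "$1" = "--apply" ]; then
            mv -i -- "$f" "$new"
        else
            printf 'would rename: %s -> %s\n' "$f" "$new"
        fi
    done
}
```

After sourcing the file, running rename_marks inside the directory lists what would change, and rename_marks --apply performs the renames. The dry run plays the same role as the -n switch of rename, and mv -i asks before overwriting when two names collapse to the same result.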