> -----Original Message-----
> From: Daniel Shahaf [mailto:d...@daniel.shahaf.name]
> Sent: 22 February 2011 09:34
> To: Johan Corveleyn
> Cc: Thomas STEININGER; Stephen Connolly; users@subversion.apache.org
> Subject: Re: Re: Antwort: Re: problem with mutated vowel in
> log-message-contents
>
> Daniel Shahaf wrote on Tue, Feb 22, 2011 at 11:26:25 +0200:
> > Johan Corveleyn wrote on Tue, Feb 22, 2011 at 09:43:25 +0100:
> > > So, all that being said, what Daniel means is that you could apply
> > > something like:
> > >
> > > svn propedit --revprop -r $REV --editor-cmd 'perl -pi -e "s/\\xfc/\\xc3\\xbc/g"'
> > >
> > > to all revisions (REV) that need to be corrected (either a list that
> > > you make up manually, or something automated with "svn propget
> > > --revprop" combined with "sed", or something similar ...).
> >
> > By the way, please don't consider this a generic solution. It's a
> > *shortcut*, which is probably okay for ü, but WILL corrupt your log
> > messages if you adapt it for §.
>
> ... because the latin1 byte sequence for § is part of some
> UTF-8 byte sequences.

Which is why you should probably use iconv(1), or any of the APIs listed at
http://www.unicodetools.com/, instead of fiddling around with perl or sed and
hard-coded, hand-crafted single-character mappings. There is potentially a lot
more than just u-umlaut to worry about.

Tony.

> > (I'm assuming that at least some log messages are already in UTF-8.)
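
To make the two points in this thread concrete, here is a minimal Python
sketch. It is not taken from the thread: the function names naive_fix and
to_utf8 and the sample log message are made up for illustration. It assumes
each log message is entirely latin1 or entirely UTF-8, and it uses the common
heuristic "if the bytes decode as UTF-8, leave them alone". That heuristic can
be fooled, because some latin1 texts happen to be valid UTF-8 too, so treat
the result as something to review rather than to trust blindly.

```python
def naive_fix(raw: bytes, latin1_byte: bytes, utf8_bytes: bytes) -> bytes:
    """Blind byte substitution, like the perl one-liner quoted above."""
    return raw.replace(latin1_byte, utf8_bytes)


def to_utf8(raw: bytes) -> bytes:
    """Re-encode latin1 bytes as UTF-8; leave valid UTF-8 untouched.

    Assumption: a message that decodes cleanly as UTF-8 already is UTF-8.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, do not touch it
    except UnicodeDecodeError:
        return raw.decode("latin-1").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical log message, once in each encoding.
text = "Prüfung nach §5"
latin1_msg = text.encode("latin-1")  # ü is byte FC, § is byte A7
utf8_msg = text.encode("utf-8")      # ü is C3 BC, § is C2 A7

# Byte FC never occurs in valid UTF-8, so the ü shortcut leaves a message
# that is already UTF-8 unchanged.
ok_for_u_umlaut = naive_fix(utf8_msg, b"\xfc", b"\xc3\xbc") == utf8_msg

# Byte A7 does occur inside UTF-8 sequences (for example in § itself),
# so the same shortcut adapted for § damages a message that is already UTF-8.
damaged = naive_fix(utf8_msg, b"\xa7", b"\xc2\xa7")
corrupts_for_section_sign = not is_valid_utf8(damaged)

# Decoding the whole message handles every latin1 character at once and
# skips messages that are already UTF-8.
converted_latin1 = to_utf8(latin1_msg)
untouched_utf8 = to_utf8(utf8_msg)
```

The same check-then-convert idea applies if you drive iconv(1) from a script:
only convert the messages that fail to validate as UTF-8, and review the
converted messages before writing them back with "svn propset --revprop".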