2013/9/26 sebb <seb...@gmail.com>:
> On 25 September 2013 17:02, Konstantin Preißer <kpreis...@apache.org> wrote:
>> Mark,
>>
>>> -----Original Message-----
>>> From: Mark Thomas [mailto:ma...@apache.org]
>>> Sent: Wednesday, September 25, 2013 5:54 PM
>>
>>> I'd say yes. Property files are a 'special' case:
>>> http://stackoverflow.com/questions/4659929/how-to-use-utf-8-in-resource-properties-with-resourcebundle
>>
>> OK, thank you for the clarification.
>>
>>> It doesn't bother me but I'm only one committer. I think this falls
>>> under the category if someone cares enough about the commit e-mails
>>> using UTF-8 then they need to work with infra to make that happen. I'm
>>> happy with things as they are.
>
> There is a property that can be used to change the encoding used by
> the SVN mailer, for example:
>
> svn:mime-type text/xml; charset=utf-8
>
> Make sure this agrees with the contents and any xml encoding attribute.
>

-1 for changing svn:mime-type in such a way.

Placing an encoding into svn:mime-type is wrong, as

a) It is not portable. (Git does not have svn properties.)

b) It is hard to keep in sync. Beware that case may matter for some
software (UTF-8 vs utf-8).

( c) You may be relying on an undocumented feature. I remember some long
discussions several years ago on whether the file encoding can be part of
svn:mime-type or should be a separate property, with no clear outcome.
http://subversion.tigris.org/issues/show_bug.cgi?id=2329
http://subversion.tigris.org/issues/show_bug.cgi?id=2194
)

Regarding the whoweare.xml file, you need to add an explicit encoding
declaration to the top of the file (as is done in
tc7.0.x/trunk/webapps/docs/changelog.xml). Without that I consider those
files to be ISO-8859-1, like the rest of our sources.

I think the commit mailer should treat the files as ISO-8859-1, as such an
interpretation does not lose any data and as that is the format of the
unified diff.

In the past there were several cases when accented characters in
Tomcat's changelog files were corrupted during editing (due to a
conversion done in someone's editor). It was visible in the commit
messages. The last time it happened was two or three years ago.
http://svn.apache.org/r999983
http://svn.apache.org/r1196769

As of now, several xml files in Tomcat (those changelogs) are
officially UTF-8, and I am OK with people using accented characters
for new text there until something breaks. (Personally, I will
probably still use numeric entities, as I do not have those characters
on my keyboard.)

AFAIK, the TortoiseSVN diff viewer has some logic to autodetect the use
of UTF-8.

Best regards,
Konstantin Kolinko
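
A minimal Java sketch of two points from the message above: that reading bytes as ISO-8859-1 does not lose any data, and that accented characters can be written as numeric entities instead of raw bytes. The class and method names (EncodingSketch, roundTripLatin1, roundTripUtf8, toNumericEntities) are made up for illustration; they are not part of Tomcat or of the SVN commit mailer, and the sketch says nothing about how the mailer actually decodes files.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodingSketch {

    // Decode as ISO-8859-1 and encode back. Each byte 0x00..0xFF maps to
    // exactly one char U+0000..U+00FF, so the original bytes are preserved.
    static byte[] roundTripLatin1(byte[] raw) {
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Same round trip through UTF-8. Byte sequences that are not valid
    // UTF-8 are replaced with U+FFFD on decoding, so data can be lost.
    static byte[] roundTripUtf8(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Replace every non-ASCII code point with a decimal numeric entity,
    // which makes the text independent of the file encoding.
    static String toNumericEntities(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // All 256 possible byte values survive the ISO-8859-1 round trip.
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        System.out.println(Arrays.equals(all, roundTripLatin1(all))); // true

        // 0xE9 is "e acute" in ISO-8859-1 but is not valid UTF-8 on its own.
        byte[] latin1EAcute = { (byte) 0xE9 };
        System.out.println(Arrays.equals(latin1EAcute, roundTripUtf8(latin1EAcute))); // false

        // U+00DF (sharp s) becomes the entity &#223;
        System.out.println(toNumericEntities("Prei\u00DFer")); // Prei&#223;er
    }
}
```

The round trip only shows that no bytes are lost; text that is really UTF-8 will still display as mojibake when read as ISO-8859-1, which is why the explicit encoding declaration in the XML file still matters.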