On 26 September 2013 23:29, Konstantin Kolinko <knst.koli...@gmail.com> wrote:
> 2013/9/26 sebb <seb...@gmail.com>:
>> On 25 September 2013 17:02, Konstantin Preißer <kpreis...@apache.org> wrote:
>>> Mark,
>>>
>>>> -----Original Message-----
>>>> From: Mark Thomas [mailto:ma...@apache.org]
>>>> Sent: Wednesday, September 25, 2013 5:54 PM
>>>
>>>> I'd say yes. Property files are a 'special' case:
>>>> http://stackoverflow.com/questions/4659929/how-to-use-utf-8-in-resource-properties-with-resourcebundle
>>>
>>> OK, thank you for the clarification.
>>>
>>>> It doesn't bother me but I'm only one committer. I think this falls
>>>> under the category if someone cares enough about the commit e-mails
>>>> using UTF-8 then they need to work with infra to make that happen. I'm
>>>> happy with things as they are.
>>
>> There is a property that can be used to change the encoding used by
>> the SVN mailer, for example:
>>
>> svn:mime-type text/xml; charset=utf-8
>>
>> Make sure this agrees with the contents and any xml encoding attribute.
>>
>
> -1 for changing svn:mime-type in such a way.
> Placing an encoding into svn:mime-type is wrong, as
> a) It is not portable. (Git does not have svn properties).

There are other svn properties that are required, so that does not make sense.

> b) It is hard to keep in sync. Beware that case may matter for some
> software (UTF-8 vs utf-8).

How often does the encoding change?

> ( c) You may be relying on an undocumented feature. I remember some
> long discussions several years ago on whether file encoding can be
> part of svn:mime-type, or it should be a separate property, with no
> clear outcome.

See http://opensource.perlig.de/svnmailer/doc-1.0/#groups-charset-property

> http://subversion.tigris.org/issues/show_bug.cgi?id=2329
> http://subversion.tigris.org/issues/show_bug.cgi?id=2194
> )
>
> Regarding whoweare.xml file, you need to add explicit encoding to the
> top of the file (like it is done in
> tc7.0.x/trunk/webapps/docs/changelog.xml). Without that I consider
> those files as ISO-8859-1, like the rest of our sources.

The default for XML is UTF-8.

>
> I think commit mailer should treat the files as ISO-8859-1, as such

XML is UTF-8 by default

> interpretation does not lose any data and as that is the format of
> unified diff.

Not sure about those last two assertions.

> In the past there were several cases when accented characters in
> Tomcat's changelog files were corrupted during editing (due to a
> conversion done in someone's editor). It was seen in commit message.
> Last time it happened two or three years ago.

That may be so, but I'm not sure what bearing that has on the svn
commit message encoding.

> http://svn.apache.org/r999983
> http://svn.apache.org/r1196769
>
> As of now, several xml files in Tomcat (those changelogs) are
> officially UTF-8, and I am OK with people using accented characters
> for new text there until something breaks.
> (Personally, I will probably still use numeric entities, as I do not
> have those characters on my keyboard.)
>
> AFAIK, TortoiseSVN diff viewer has some logic to autodetect the use of UTF-8.
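
On the "does not lose any data" point, a minimal Python sketch may help
separate the two questions (are the bytes preserved, and is the text
displayed correctly). The sample string and the helper name is_valid_utf8
are only illustrative assumptions, not anything taken from the mailer or
from Tomcat:

```python
# Sketch: how ISO-8859-1 and UTF-8 decoders treat the same bytes.
name = "Prei\u00dfer"           # illustrative sample with one accented character
utf8_bytes = name.encode("utf-8")  # what a UTF-8 encoded file would contain

# ISO-8859-1 assigns a character to every one of the 256 byte values,
# so decoding never fails and re-encoding yields the original bytes.
as_latin1 = utf8_bytes.decode("iso-8859-1")
round_trip_ok = as_latin1.encode("iso-8859-1") == utf8_bytes

# The bytes survive, but the displayed text is no longer the original.
displays_correctly = as_latin1 == name


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ISO-8859-1 bytes for the same name contain a lone 0xDF,
# which is not a valid UTF-8 sequence on its own.
latin1_bytes = name.encode("iso-8859-1")
```

Under these assumptions the round trip through ISO-8859-1 keeps every byte,
yet the accented character is shown incorrectly, and the ISO-8859-1 form of
the name is rejected by a strict UTF-8 decoder. So "no data lost" and
"readable in the commit mail" are separate properties.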
>
> Best regards,
> Konstantin Kolinko
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org