Hi all,

> -----Original Message-----
> From: kpreis...@apache.org [mailto:kpreis...@apache.org]
> Sent: Tuesday, September 24, 2013 9:11 PM

> --- tomcat/site/trunk/xdocs/whoweare.xml (original)
> +++ tomcat/site/trunk/xdocs/whoweare.xml Tue Sep 24 19:10:44 2013
> @@ -100,6 +100,9 @@ A complete list of all the Apache Commit
>  <p><b>Costin Manolache</b> (costin at apache.org)<br/></p>
>  <!--Your bio goes here-->
> 
> +<p><b>Konstantin Preißer</b> (kpreisser at apache.org)<br/></p>

When editing whoweare.xml, I wrote the "ß" character (sharp s), which is now 
displayed as "ÃŸ" in the commit message, because the source XML file is encoded 
in UTF-8 (the default encoding for XML files).

As far as I understand, SVN needs to treat changes in text files at the byte 
level, not at the character level, to be independent of character encodings. 
That is why, for example, ".patch" files don't have a character encoding: they 
describe changes at the byte level.

However, when the commit e-mail is sent, the bytes need to be converted to 
characters, and it seems the SVN commit diff is interpreted as ISO-8859-1 (or 
Windows-1252). Therefore, the UTF-8 bytes 0xC3 0x9F are displayed as "ÃŸ" 
instead of "ß".
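
To illustrate what I suspect is happening, here is a minimal Python sketch (not 
the actual mailer code; the variable names are my own):

```python
# The sharp s as it is stored in the UTF-8 encoded whoweare.xml.
utf8_bytes = "ß".encode("utf-8")
assert utf8_bytes == b"\xc3\x9f"

# Decoding those two bytes with a single-byte charset yields two characters.
as_cp1252 = utf8_bytes.decode("cp1252")      # U+00C3 followed by U+0178
as_latin1 = utf8_bytes.decode("iso-8859-1")  # U+00C3 followed by control U+009F

# Decoding them as UTF-8 gives the single intended character back.
as_utf8 = utf8_bytes.decode("utf-8")

print(repr(as_cp1252), repr(as_latin1), repr(as_utf8))
```

Under Windows-1252 the second byte maps to a visible character, while under 
strict ISO-8859-1 it is a control character, so the exact garbage shown depends 
on which of the two the mail client really uses.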

What would be the preferred way to handle such issues? One way I can think of 
would be to XML-encode such characters ("ß" as "&#xDF;"). However, personally I 
would rather not do this, but write such characters directly ("ß"), so that the 
source is more readable (and encodings like UTF-8 guarantee that the 
characters are interpreted the same way on every system, independently of the 
system language or geographic location).
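
As a quick check that both spellings are equivalent to an XML parser, here is a 
small Python sketch using the standard xml.etree.ElementTree module (the <p> 
snippets are made-up stand-ins for the real entry in whoweare.xml):

```python
import xml.etree.ElementTree as ET

# Variant 1: numeric character reference, pure ASCII in the source file.
escaped = b"<p>Prei&#xDF;er</p>"

# Variant 2: the literal character, stored as UTF-8 bytes in the source file.
literal = "<p>Prei\u00dfer</p>".encode("utf-8")

text_escaped = ET.fromstring(escaped).text
text_literal = ET.fromstring(literal).text

# Both variants parse to the same string; only the raw source differs.
print(text_escaped == text_literal)
```

So the choice between the two only affects how the raw source (and therefore 
the diff) looks, not the generated page.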

Would it be possible to change the SVN commit e-mail system so that it 
interprets diffs as UTF-8 instead of ISO-8859-1 (assuming all files which 
contain bytes > 0x7F are encoded as UTF-8)? (Or so that it tries to decode the 
diff as UTF-8 and, if that fails, falls back to ISO-8859-1?)
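
A minimal Python sketch of the fallback I have in mind (decode_diff is a 
hypothetical helper, not part of the existing commit mailer, and I am assuming 
the mailer has the raw diff bytes available):

```python
def decode_diff(raw: bytes) -> str:
    """Decode diff bytes as UTF-8, falling back to ISO-8859-1 on failure."""
    try:
        # Strict decoding: raises if the bytes are not well-formed UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Every byte value is defined in ISO-8859-1, so this cannot fail.
        return raw.decode("iso-8859-1")


# A UTF-8 encoded file and an ISO-8859-1 encoded file both come out right.
print(decode_diff(b"Prei\xc3\x9fer"))
print(decode_diff(b"Prei\xdfer"))
```

Since any byte sequence is valid ISO-8859-1, the fallback always produces some 
text; it would only show wrong characters if a file used a third encoding.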

For example, when I use TortoiseSVN to view the unified diff of r152597, then 
it prints the "ß" character, so it seems to interpret it as UTF-8.

Can you give me a hint?

Thanks!

Kind regards,
Konstantin Preißer

