[
https://jira.codehaus.org/browse/SUREFIRE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=362273#comment-362273
]
Andreas Gudian commented on SUREFIRE-1137:
------------------------------------------
I should have answered last night already: I was able to reproduce the
problem on my local Windows machine by encoding the test java file as UTF-8 and
using UTF-8 in the pom. Stacktraces and error messages are correctly encoded in
the output XML, but the sysout doesn't survive the journey, just as Jürgen
describes.
In my main Maven process, {{Charset.defaultCharset()}} is windows-1252,
whereas in the forked VM {{Charset.defaultCharset()}} is UTF-8. The current
implementation relies on the default charset being the same in both the main
process and the forked process, hence the encoding garbage.
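To illustrate the mismatch, here is a minimal sketch (not Surefire code; the class and method names are made up for this example) that encodes a string with one charset and decodes the bytes with another, the way the fork and the main process currently disagree. It assumes the JVM ships the windows-1252 charset, which common JDKs do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Surefire.
public class CharsetMismatchDemo {

    /** Encodes text with the writer's charset and decodes the bytes with the reader's. */
    static String transfer(String text, Charset writer, Charset reader) {
        byte[] bytes = text.getBytes(writer);
        return new String(bytes, reader);
    }

    public static void main(String[] args) {
        Charset forkCharset = StandardCharsets.UTF_8;            // forked VM in the scenario above
        Charset mainCharset = Charset.forName("windows-1252");   // main Maven process

        String original = "J\u00fcrgen"; // "Jürgen", written with an escape to avoid source-encoding issues

        // Same charset on both sides: the text survives.
        System.out.println(transfer(original, forkCharset, forkCharset).equals(original));

        // Different charsets: the two UTF-8 bytes of the umlaut are read as two separate characters.
        System.out.println(transfer(original, forkCharset, mainCharset).equals(original));
    }
}
```

With UTF-8 on the writing side and windows-1252 on the reading side, the umlaut's two bytes (0xC3 0xBC) come back as two characters, which is the kind of garbage that ends up in the sysout section of the report.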
* If I don't pass file.encoding to the forked VM, then the forked VM also uses
windows-1252.
* If I pass -Dfile.encoding=UTF-8 in MAVEN_OPTS to the main process, then
System.getProperty("file.encoding") is "UTF-8", but
{{Charset.defaultCharset()}} _remains windows-1252_ - I was not able to
manipulate the defaultCharset of the main process with a system property.
But the documentation is quite clear on that: you're not supposed to change the
defaultCharset by using file.encoding, but instead change the system's locale /
language settings. Meh.
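The two values compared above can be printed side by side with a small probe like the following. This is only a sketch with a made-up class name; what it prints depends on the JVM version, the platform, and how the process was started, so no particular output is implied.

```java
import java.nio.charset.Charset;

// Hypothetical probe class, not part of Surefire.
public class DefaultCharsetProbe {

    /** Returns the file.encoding system property next to the JVM's actual default charset. */
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once in the main process and once in the fork (for example from a test) would show whether the two sides agree.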
I'm not really sure yet what to make of this. I could pass the fork's
defaultCharset back to the main process to properly recode the stream into
UTF-8. I could pass the main's defaultCharset to the fork to use that one for
encoding the String in PrintStream's print(String) method (although that may
cause strange side effects with other ways of using that print stream). Or I
could convert any print stream activity in the fork to UTF-16 (although not
every charset can transform all its characters to UTF-16 and then back again
from UTF-16, which is why I tried to rely on the default charset in the first
place)...
So I might go with the first option, but I still need to think about it (to see
if it really is the right thing to do).
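A rough sketch of what the first option could look like, assuming the fork reports the name of its default charset to the main process. The class name, method name, and the way the charset name is transported are all hypothetical and not taken from the Surefire code base.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Surefire.
public class ForkOutputRecoder {

    /**
     * Decodes bytes written by the fork using the charset name the fork reported,
     * then re-encodes the text as UTF-8 for the report.
     */
    static byte[] recodeToUtf8(byte[] forkBytes, String forkCharsetName) {
        Charset forkCharset = Charset.forName(forkCharsetName);
        String text = new String(forkBytes, forkCharset);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFC is the single-byte windows-1252 encoding of the u-umlaut.
        byte[] fromFork = { (byte) 0xFC };
        byte[] utf8 = recodeToUtf8(fromFork, "windows-1252");
        System.out.println(utf8.length); // the umlaut takes two bytes in UTF-8
    }
}
```

This version only works on complete byte arrays. If the fork's output arrives in chunks and its charset is a multi-byte one, a chunk boundary may split a character, so a real implementation would more likely need a stateful java.nio.charset.CharsetDecoder.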
If you guys have an idea here, let me know.
> Problem with Umlauts in stdout
> ------------------------------
>
> Key: SUREFIRE-1137
> URL: https://jira.codehaus.org/browse/SUREFIRE-1137
> Project: Maven Surefire
> Issue Type: Bug
> Components: Maven Surefire Plugin
> Affects Versions: 2.18
> Environment: Linux
> Reporter: Jürgen Zeller
> Assignee: Andreas Gudian
> Attachments: surefire-test.zip
>
>
> When using Cp1252 as file encoding, the generated Surefire stdout report
> contains invalid characters when run on Linux. When running the same test on
> Windows, everything is fine.
> A similar problem was reported in SUREFIRE-998.