[
https://issues.apache.org/jira/browse/HADOOP-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725900#comment-13725900
]
Brandon Li commented on HADOOP-9801:
------------------------------------
The patch looks good.
For the unit test (testMultiByteCharacters) to exercise the issue even when
running on Linux, we may want to set the default charset to a non-UTF-8
encoding in the test.
> Configuration#writeXml uses platform default encoding, which may mishandle
> multi-byte characters.
> ----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-9801
> URL: https://issues.apache.org/jira/browse/HADOOP-9801
> Project: Hadoop Common
> Issue Type: Bug
> Components: conf
> Affects Versions: 3.0.0, 1-win, 1.3.0, 2.1.1-beta
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: HADOOP-9801-branch-1.1.patch, HADOOP-9801-trunk.1.patch
>
>
> The overload of {{Configuration#writeXml}} that accepts an {{OutputStream}}
> does not set the encoding explicitly, so it uses the platform default
> encoding. When that default is not a multi-byte-capable encoding such as
> UTF-8, multi-byte characters can be written incorrectly.
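
To illustrate the failure mode described above, here is a minimal sketch using
only the JDK. It is not Hadoop code: the class EncodingSketch and the helper
roundTrip are hypothetical names, and US-ASCII stands in for a non-UTF-8
platform default. The helper writes text through an OutputStreamWriter with a
given charset and then decodes the bytes as UTF-8, as a reader of the XML
would likely do.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Hypothetical helper: encode text with writeCharset (standing in for the
    // platform default), then decode the resulting bytes as UTF-8.
    public static String roundTrip(String text, Charset writeCharset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, writeCharset)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = "\u4e2d\u6587"; // two CJK characters, multi-byte in UTF-8

        // Explicit UTF-8 preserves the characters.
        System.out.println(value.equals(roundTrip(value, StandardCharsets.UTF_8)));

        // US-ASCII cannot represent them, so the encoder substitutes '?'.
        System.out.println(value.equals(roundTrip(value, StandardCharsets.US_ASCII)));
    }
}
```

Under these assumptions the first comparison prints true and the second prints
false. Passing the charset explicitly makes the result independent of the
platform default.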