MXParser can't handle the encoding declaration in XML declaration 
------------------------------------------------------------------

         Key: MNG-2148
         URL: http://jira.codehaus.org/browse/MNG-2148
     Project: Maven 2
        Type: Bug

  Components: POM  
    Reporter: Naoki Nose
 Attachments: src.jar

The XML pull parser in plexus-utils (MXParser.java) can't handle the encoding 
declaration in the XML declaration.
As a result, it is impossible to use an encoding different from the system 
default encoding. 
This is critical in Japan, because there are two commonly used encodings in 
Japanese environments (SJIS and EUC-JP).

I think MXParser should handle the encoding declaration in XML as described 
in the W3C specification:
http://www.w3.org/TR/REC-xml/#sec-guessing

I tried to fix this problem (see attachment).
I changed the setInput(InputStream) method to detect the encoding in the XML 
declaration.
In writing this code, I referred to the source code of Apache Xerces.
UCS-4 and UCS-2 aren't supported in this implementation, because
these encodings aren't supported by the Sun JDK.
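
For readers without the attachment, the detection step can be sketched roughly 
as follows. This is not the attached patch and not plexus-utils code: the class 
XmlEncodingSniffer, its methods, and the 256-byte peek limit are all 
hypothetical names and assumptions made for illustration. It checks for a 
byte order mark, then for UTF-16 without one, then reads the encoding 
pseudo-attribute from an ASCII-compatible declaration, and falls back to UTF-8. 
UCS-4 and EBCDIC are deliberately left out.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the attached patch and not part of plexus-utils.
final class XmlEncodingSniffer {
    // Assumed upper bound on the size of the XML declaration.
    private static final int PEEK = 256;
    private static final Pattern ENCODING =
        Pattern.compile("encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    private static boolean startsWith(byte[] b, int... prefix) {
        if (b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // Number of byte-order-mark bytes at the start of the stream (0 if none).
    static int bomLength(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return 3;
        if (startsWith(head, 0xFE, 0xFF) || startsWith(head, 0xFF, 0xFE)) return 2;
        return 0;
    }

    // Guess the encoding name from the first bytes of an XML document.
    static String detect(byte[] head) {
        // A byte order mark takes precedence. UCS-4 is not handled here,
        // so FF FE 00 00 is treated as UTF-16LE.
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        // "<?" encoded as UTF-16 without a byte order mark.
        if (startsWith(head, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16BE";
        if (startsWith(head, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16LE";
        // ASCII-compatible encodings: the declaration itself is plain ASCII,
        // so it can be read safely as ISO-8859-1.
        String s = new String(head, StandardCharsets.ISO_8859_1);
        if (s.startsWith("<?xml")) {
            int end = s.indexOf("?>");
            if (end > 0) {
                Matcher m = ENCODING.matcher(s.substring(0, end));
                if (m.find()) return m.group(1);
            }
        }
        return "UTF-8"; // XML default when nothing is declared
    }

    // Peek at the stream, rewind, skip any byte order mark, and decode the rest.
    static Reader open(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(PEEK);
        byte[] buf = new byte[PEEK];
        int n = 0, r;
        while (n < PEEK && (r = in.read(buf, n, PEEK - n)) > 0) n += r;
        byte[] head = Arrays.copyOf(buf, n);
        in.reset();
        for (int i = bomLength(head); i > 0; i--) in.read();
        return new InputStreamReader(in, detect(head));
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"Shift_JIS\"?><project/>"
            .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(doc)); // prints Shift_JIS
    }
}
```

A parser's setInput(InputStream) could pass the Reader returned by open() on to 
its existing Reader-based input path. Whether a declared name such as Shift_JIS 
or EUC-JP can actually be decoded depends on the charsets shipped with the JDK 
in use.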

Xerces solves this problem by providing its own readers for these encodings. 
I think Xerces's solution is too complex for plexus-utils.

To solve this issue, changing plexus-utils alone is not sufficient, because
DefaultMavenProjectBuilder reads the POM with a FileReader without specifying 
an encoding.
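
The reason a FileReader is a problem: it decodes bytes with the platform 
default charset before the parser ever sees them, so the parser has no chance 
to honor the declared encoding. The small sketch below shows the effect on a 
single byte. The class PomDecodingDemo and its methods are hypothetical and do 
not come from DefaultMavenProjectBuilder; ISO-8859-1 stands in for SJIS or 
EUC-JP only because every JDK is guaranteed to ship it.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the actual DefaultMavenProjectBuilder code.
final class PomDecodingDemo {
    // Decode raw file bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Open a file with an explicit charset instead of the platform default.
    static Reader openWithCharset(File pom, Charset charset) throws IOException {
        return new InputStreamReader(new FileInputStream(pom), charset);
    }

    public static void main(String[] args) {
        byte[] onDisk = {(byte) 0xE9}; // U+00E9 as stored in an ISO-8859-1 file

        // Decoded with the charset the file was written in: the character survives.
        System.out.println(
            decode(onDisk, StandardCharsets.ISO_8859_1).equals("\u00e9")); // prints true

        // Decoded with a different charset: the byte becomes U+FFFD.
        System.out.println(
            decode(onDisk, StandardCharsets.UTF_8).equals("\uFFFD")); // prints true
    }
}
```

One possible fix on the caller's side is to hand the parser an InputStream, or 
a Reader built with the detected charset as in openWithCharset, rather than a 
FileReader.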



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
