[jira] Updated: (MASSEMBLY-371) Converting line endings corrupts ISO-8859-1 files when platform encoding is UTF-8

Dennis Lundberg (JIRA) Thu, 03 Feb 2011 13:50:52 -0800

     [ 
http://jira.codehaus.org/browse/MASSEMBLY-371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Dennis Lundberg updated MASSEMBLY-371:
--------------------------------------

    Affects Version/s: 2.2

> Converting line endings corrupts ISO-8859-1 files when platform encoding is 
> UTF-8
> ---------------------------------------------------------------------------------
>
>                 Key: MASSEMBLY-371
>                 URL: http://jira.codehaus.org/browse/MASSEMBLY-371
>             Project: Maven 2.x Assembly Plugin
>          Issue Type: Bug
>    Affects Versions: 2.2-beta-2, 2.2
>         Environment: Linux with platform encoding set to UTF-8
>            Reporter: Håvard Wigtil
>         Attachments: assembly-encoding.zip
>
>
> Converting line endings for a text file encoded in ISO-8859-1 replaces every 
> non-ASCII character with the three characters ï¿½.
> The file to be converted appears to be read as text in the platform encoding 
> (this seems to happen in method readFile in class FileFormatter). When the 
> platform encoding is UTF-8, each non-ASCII ISO-8859-1 byte is decoded to the 
> Unicode replacement character U+FFFD (i.e. the placeholder for an unknown or 
> broken character). 
> I've attached a small sample project that reproduces the problem on Linux 
> with the platform encoding set to UTF-8.
> I see two possible fixes: one is to read the file as bytes and do a 
> search/replace on the line endings, and the other is to allow an encoding to 
> be specified for a fileset or file.
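
The failure mode and the two proposed fixes described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the plugin's actual code: the class LineEndingSketch and its methods roundTrip, toLfBytes and toLfText are hypothetical names, and the sample input is invented. The byte-level variant assumes an ASCII-compatible encoding (ISO-8859-1, UTF-8 and similar); it would not be safe for UTF-16.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names come from the Assembly Plugin.
final class LineEndingSketch {

    /** Reproduces the problem: decode with an assumed charset, then re-encode. */
    static byte[] roundTrip(byte[] input, Charset assumed) {
        // Bytes that are malformed in 'assumed' are decoded to U+FFFD.
        return new String(input, assumed).getBytes(assumed);
    }

    /** Fix 1: normalize CRLF and lone CR to LF on raw bytes, without decoding. */
    static byte[] toLfBytes(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        for (int i = 0; i < input.length; i++) {
            byte b = input[i];
            if (b == '\r') {
                out.write('\n');
                // Skip the LF that completes a CRLF pair.
                if (i + 1 < input.length && input[i + 1] == '\n') {
                    i++;
                }
            } else {
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    /** Fix 2: decode and re-encode with an explicitly configured charset. */
    static byte[] toLfText(byte[] input, Charset charset) {
        String text = new String(input, charset);
        return text.replace("\r\n", "\n").replace('\r', '\n').getBytes(charset);
    }

    public static void main(String[] args) {
        // "Håvard" followed by CRLF, encoded as ISO-8859-1 (å is the single byte 0xE5).
        byte[] latin1 = "H\u00e5vard\r\n".getBytes(StandardCharsets.ISO_8859_1);
        for (byte b : roundTrip(latin1, StandardCharsets.UTF_8)) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```

Decoding the byte 0xE5 as UTF-8 yields U+FFFD, which re-encodes as the bytes EF BF BD; viewed as ISO-8859-1 those are the three characters quoted in the report. Both toLfBytes and toLfText (with ISO_8859_1 passed explicitly) leave the 0xE5 byte intact.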

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
