[
https://jira.codehaus.org/browse/MASSEMBLY-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=312643#comment-312643
]
Dennis Lundberg commented on MASSEMBLY-371:
-------------------------------------------
More fixes in
[r1403897|http://svn.apache.org/viewvc?view=revision&revision=1403897].
> Converting line endings corrupts ISO-8859-1 files when platform encoding is
> UTF-8
> ---------------------------------------------------------------------------------
>
> Key: MASSEMBLY-371
> URL: https://jira.codehaus.org/browse/MASSEMBLY-371
> Project: Maven 2.x Assembly Plugin
> Issue Type: Bug
> Affects Versions: 2.2-beta-2, 2.2
> Environment: Linux with platform encoding set to UTF-8
> Reporter: Håvard Wigtil
> Assignee: Dennis Lundberg
> Fix For: 2.4
>
> Attachments: assembly-encoding.zip
>
>
> Converting line endings for a text file encoded in ISO-8859-1 replaces any
> non-ASCII character with the three characters ï¿½.
> The file to be converted is read as text in the platform encoding (this
> seems to happen in method readFile in class FileFormatter). When the
> platform encoding is UTF-8, any non-ASCII character from ISO-8859-1 is
> decoded as the Unicode replacement character "�" (U+FFFD, the placeholder
> for an unknown or broken character).
> I've attached a small sample project that shows this problem on Linux with
> platform encoding set to UTF-8.
> I see two possible fixes for this: one is to read the file as bytes and do a
> search/replace for line endings, and the other is to make it possible to
> specify an encoding for a fileset or file.
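
For illustration, here is a minimal Java sketch of the failure described in the report and of the two fixes it proposes. It is not taken from the plugin's source: the class LineEndingSketch and all of its method names are hypothetical, and it assumes the target line ending is CRLF.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class LineEndingSketch {

    // Reproduces the corruption: ISO-8859-1 bytes are decoded as UTF-8,
    // so each malformed high byte becomes U+FFFD, which is then written
    // back as the three bytes EF BF BD.
    static byte[] roundTripAsUtf8(byte[] latin1Bytes) {
        String decoded = new String(latin1Bytes, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Proposed fix 1: convert LF and CRLF to CRLF on the raw bytes, without
    // decoding. Assumes an ASCII-compatible encoding (ISO-8859-1, UTF-8, ...);
    // it would not be correct for UTF-16. A lone CR is passed through as-is.
    static byte[] toCrLfBytes(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length + 16);
        for (int i = 0; i < input.length; i++) {
            byte b = input[i];
            if (b == '\r' && i + 1 < input.length && input[i + 1] == '\n') {
                continue; // skip the CR of an existing CRLF; the LF below re-adds it
            }
            if (b == '\n') {
                out.write('\r');
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    // Proposed fix 2: decode and encode with an explicitly chosen charset
    // instead of the platform default.
    static byte[] toCrLfWithCharset(byte[] input, Charset charset) {
        String text = new String(input, charset);
        String converted = text.replaceAll("\r\n|\n", "\r\n");
        return converted.getBytes(charset);
    }
}
```

With the input bytes 61 E5 0A (the letter a, the ISO-8859-1 byte for å, and LF), roundTripAsUtf8 yields 61 EF BF BD 0A, while both conversion methods (the second one called with StandardCharsets.ISO_8859_1) keep the E5 byte and only change the line ending.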