[ https://jira.codehaus.org/browse/MASSEMBLY-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=361688#comment-361688 ]
Hannes Kogler edited comment on MASSEMBLY-748 at 1/23/15 2:59 AM:
------------------------------------------------------------------

But I cannot understand why the maven-assembly-plugin should behave differently from the default OS behavior here. Windows handles umlauts in file names inside zip archives perfectly well if you compress the same file(s) with the Windows compression tool itself. If you take the buildinf.playground project (from my sample project), put it on the Windows desktop, right-click it and choose "send to/Compressed (Zipped) folder", the compression works fine, and opening the zip file with Windows shows the correct file name.

So from my point of view the maven-assembly-plugin needs to behave the way you expect and know from your OS. Any other behavior is annoying and error-prone, since currently you may lose the umlauts after extracting such a zip file on Windows.

In other words, there are two ways to solve this issue as I see it:

* The maven-assembly-plugin should just use the OS encoding (like it did before your "switch over to UTF-8"?!).
* Alternatively, the default encoding can certainly be UTF-8, but there needs to be a way to configure a different encoding for extraction as well, just as the <archiverConfig> tag already provides for compression only, so that both directions work consistently.

PS: Just execute mvn clean install on the _buildinf.module_ using the Maven profile "otherWay". You should then be able to reproduce the problem, because this profile skips the <archiverConfig> stanza. That way you are able to decompress the archive with the assembly plugin, but you cannot really use the produced zip file [of the project buildinf.playground] on Windows systems, because if you just extract it there you lose the umlauts forever.
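To illustrate the kind of mismatch described above, here is a minimal sketch in Python. It does not use the maven-assembly-plugin or plexus-archiver at all; it only assumes that a zip entry name is stored as raw bytes in one encoding and decoded with another on extraction. The file name and the helper roundtrip are hypothetical and chosen just for this example.

```python
# Sketch only: models an entry name written in one encoding and read in another.
# The file name and the helper below are hypothetical, not taken from the sample project.
name = "Prüfbericht_äöü.txt"


def roundtrip(entry_name: str, write_enc: str, read_enc: str) -> str:
    """Encode the name as an archiver would, then decode it as an extractor would."""
    raw = entry_name.encode(write_enc)
    # errors="replace" substitutes U+FFFD for bytes that are invalid in read_enc
    return raw.decode(read_enc, errors="replace")


same = roundtrip(name, "cp850", "cp850")     # writer and reader agree
mixed = roundtrip(name, "cp850", "utf-8")    # written as CP850, read as UTF-8
reverse = roundtrip(name, "utf-8", "cp850")  # written as UTF-8, read as CP850

for label, value in (("same", same), ("mixed", mixed), ("reverse", reverse)):
    print(f"{label:8} {value!r}")
```

Only the case where both sides use the same encoding returns the original name. This is why an encoding setting that applies to compression alone cannot give a consistent round trip.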
> problem to extract zip files including file names with umlauts
> --------------------------------------------------------------
>
>                 Key: MASSEMBLY-748
>                 URL: https://jira.codehaus.org/browse/MASSEMBLY-748
>             Project: Maven Assembly Plugin
>          Issue Type: Bug
>          Components: maven-archiver
>    Affects Versions: 2.5.3
>         Environment:
>            Reporter: Hannes Kogler
>            Assignee: Kristian Rosenvold
>             Fix For: 2.5.4
>
>         Attachments: encoding_problem_on_zip_extract.7z
>
>
> As reported in another issue, you need to explicitly set the code page
> CP850 (using the following configuration) to create zip packages whose
> file names contain correct umlauts:
> <archiverConfig>
>   <encoding>CP850</encoding>
> </archiverConfig>
> However, this solution is not 100% useful, because if you extract this
> archive, which obviously contains the correct umlauts, wrong characters
> reappear for all umlauts.
> It's strange, because if you unzip this zip file with any other zip tool
> (7zip, Windows native zip support and so on) the extraction works fine.
> Only when using the maven-assembly-plugin do the umlauts get corrupted.
> (An attempt to set the archiverConfig with CP850 for the extracting
> execution of the assembly plugin as well just results in an error:
> Failed to configure archiver:
> " org.codehaus.plexus.archiver.dir.DirectoryArchiver: Cannot find 'encoding'
> in class org.codehaus.plexus.archiver.dir.DirectoryArchiver " )

--
This message was sent by Atlassian JIRA (v6.1.6#6162)