[
https://issues.apache.org/jira/browse/COMPRESS-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18069230#comment-18069230
]
Uwe Voigt commented on COMPRESS-722:
------------------------------------
I have also looked at it and have a working attempt.
The already mentioned optional setting {{autoCompress}} is provided to
overloaded constructors in ZipArchiveOutputStream. If it is true, the existing
specific compressors for BZIP2, XZ and ZSTD are registered. It is also possible
to register additional compressors for additional methods if needed.
In {{putArchiveEntry}} the appropriate compressor stream is created (factory
mechanism) and used to compress the data in {{{}write{}}}.
Furthermore, I call {{closeArchiveEntry}} in {{finish}} because this
corresponds to the procedure of {{{}ZipOutStream{}}}.
This can break existing user code. However, I have no idea why an exception
should be expected when closing a stream with an unclosed entry.
> ZStandard compression does not work correctly in ZipArchiveOutputStream
> -----------------------------------------------------------------------
>
> Key: COMPRESS-722
> URL: https://issues.apache.org/jira/browse/COMPRESS-722
> Project: Commons Compress
> Issue Type: Bug
> Components: Archivers, Compressors
> Affects Versions: 1.28.0
> Reporter: Uwe Voigt
> Priority: Major
>
> Hi,
>
> When writing a using ZSD compression this leads to corrupt archive.
> The expectation is (correct me if I am wrong) that the following code should
> produce functional archive:
>
> {code:java}
> @Test
> public void doSave() throws Exception {
> try (ZipArchiveOutputStream out = new ZipArchiveOutputStream(new
> File("test-zstd.zip"))) {
> ZipArchiveEntry e1 = new ZipArchiveEntry("one.txt");
> e1.setMethod(ZipMethod.ZSTD.getCode());
> out.putArchiveEntry(e1);
> out.write("""
> text file
> with multiple line
> """.getBytes());
> out.closeArchiveEntry();
> ZipArchiveEntry e2 = new ZipArchiveEntry("two.txt");
> e2.setMethod(ZipMethod.ZSTD.getCode());
> out.putArchiveEntry(e2);
> out.write("""
> another text file
> some more line
> """.getBytes());
> out.closeArchiveEntry();
> }
> org.apache.commons.compress.archivers.zip.ZipFile zipFile =
> org.apache.commons.compress.archivers.zip.ZipFile.builder().setFile("test-zstd.zip").get();
> Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
> ZipArchiveEntry lastEntry = null;
> while (entries.hasMoreElements()) {
> ZipArchiveEntry entry = entries.nextElement();
> System.out.println(entry.getName() + " / " + entry.getSize() + " / " +
> entry.getCompressedSize());
> if (!entries.hasMoreElements())
> lastEntry = entry;
> }
> try (InputStream in = new
> ZstdCompressorInputStream(zipFile.getInputStream(lastEntry))) {
> System.out.println(new String(in.readAllBytes()));
> }
> }
> {code}
> Instead, an exception is thrown:
>
> {{}}
> {code:java}
> com.github.luben.zstd.ZstdIOException: Truncated source at
> com.github.luben.zstd.ZstdInputStreamNoFinalizer.readInternal(ZstdInputStreamNoFinalizer.java:174)
> at
> com.github.luben.zstd.ZstdInputStreamNoFinalizer.read(ZstdInputStreamNoFinalizer.java:136)
> at com.github.luben.zstd.ZstdInputStream.read(ZstdInputStream.java:103) at
> org.apache.commons.compress.compressors.zstandard.ZstdCompressorInputStream.read(ZstdCompressorInputStream.java:119)
> at java.base/java.io.InputStream.readNBytes(InputStream.java:412) at
> java.base/java.io.InputStream.readAllBytes(InputStream.java:349) at
> zipeditor.Zip64Test.doSave(Zip64Test.java:123) ...{code}
> The only working code would be this
>
>
> {code:java}
> publicstatic OutputStream noClose(OutputStream out) {
> return new OutputStream() {
> @Override
> public void write(int b) throws IOException {
> out.write(b);
> }
>
> @Override
> public void write(byte[] b, int off, int len) throws IOException {
> out.write(b, off, len);
> }
>
> @Override
> public void flush() throws IOException {
> out.flush();
> }
>
> @Override
> public void close() throws IOException {
> // do nothing
> }
> };
>
> }
>
> @Test
> public void doSave() throws Exception {
> try (ZipArchiveOutputStream out = new ZipArchiveOutputStream(new
> File("test-zstd.zip"))) {
> ZipArchiveEntry e1 = new ZipArchiveEntry("one.txt");
> e1.setMethod(ZipMethod.ZSTD.getCode());
> byte[] bytes = """
> text file
> with multiple line
> """.getBytes();
> e1.setSize(bytes.length);
> out.putArchiveEntry(e1);
> try (OutputStream zstdOutput =
> ZstdCompressorOutputStream.builder().setOutputStream(noClose(out)).get()) {
> zstdOutput.write(bytes);
> zstdOutput.flush();
> }
> out.closeArchiveEntry();
> ZipArchiveEntry e2 = new ZipArchiveEntry("two.txt");
> e2.setMethod(ZipMethod.ZSTD.getCode());
> String second = """
> another text file
> some more line
> """;
> e2.setSize(second.getBytes().length);
> out.putArchiveEntry(e2);
> try (OutputStream o2 =
> ZstdCompressorOutputStream.builder().setOutputStream(noClose(out)).get()) {
> o2.write(second.getBytes());
> o2.flush();
> }
> out.closeArchiveEntry();
> }
>
> boolean jni = true;
> org.apache.commons.compress.archivers.zip.ZipFile zipFile =
> org.apache.commons.compress.archivers.zip.ZipFile
> .builder().setZstdInputStreamFactory((inpStream) -> jni ? new
> ZstdInputStreamNoFinalizer(inpStream) : new
> io.airlift.compress.zstd.ZstdInputStream(inpStream))
> .setFile("test-zstd.zip").get();
> Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
> ZipArchiveEntry lastEntry = null;
> while (entries.hasMoreElements()) {
> ZipArchiveEntry entry = entries.nextElement();
> System.out.println(entry.getName() + " / " + entry.getSize() + " / " +
> entry.getCompressedSize());
> if (!entries.hasMoreElements())
> lastEntry = entry;
> }
> InputStream in = zipFile.getInputStream(lastEntry);
> System.out.println(new String(in.readAllBytes()));
>
> zipFile.close();
> }
> {code}
>
>
>
>
> {color:#000000}I think it should not be left to the caller of the API to
> create such a lot of code just to workaround something that should be done by
> the library.{color}
>
> {color:#000000}Can you please investigate and hopefully fix that
> behavior.{color}
>
> {color:#000000}Best regards,{color}
> {color:#000000}Uwe{color}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)