[ https://issues.apache.org/jira/browse/LUCENE-8739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17322015#comment-17322015 ]
Adrien Grand commented on LUCENE-8739:
--------------------------------------

I forgot to update this issue, but I actually played with ZSTD a few months ago using JNA. I have a dirty, ugly, untested branch at https://github.com/jpountz/lucene-solr/tree/zstd if you are curious.

The results were good, but not as appealing as benchmarks that work on whole files. It seems to me that most of the compression gains of ZSTD over Deflate come from the larger sliding window that it uses at compression time (Deflate can only deduplicate strings that occur within 32kB of each other). But since Lucene splits stored fields into small-ish blocks anyway in order to keep decompression fast, ZSTD didn't yield much smaller indexes.

Regarding compression/decompression speed, ZSTD did perform better than vanilla DEFLATE, but most of this gap can be closed by using a DEFLATE variant that vectorizes the slowest bits, like Cloudflare's DEFLATE. This can be done with the default codec by putting the other DEFLATE variant on the LD_LIBRARY_PATH.

> ZSTD Compressor support in Lucene
> ---------------------------------
>
>                 Key: LUCENE-8739
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8739
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/codecs
>            Reporter: Sean Torres
>            Priority: Minor
>              Labels: features
>
> ZStandard offers a great tradeoff between speed and compression ratio.
> ZStandard is an open-source compression algorithm from Facebook.
> More about ZSTD:
> [https://github.com/facebook/zstd]
> [https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/]
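
The point in the comment about window size versus block size can be illustrated with a small, self-contained sketch. It is not Lucene code and does not use ZSTD. It uses Python's zlib module and varies DEFLATE's own window (the `wbits` parameter) as a stand-in for "small window versus large window". The synthetic data, the 16 KiB repeat distance, the 4 KiB block size, and the helper names `deflate_size`, `blocked_size` and `make_data` are all assumptions chosen for illustration. Real stored fields will show a much less extreme effect.

```python
import random
import zlib


def deflate_size(data: bytes, wbits: int = 15, level: int = 6) -> int:
    """Compressed size of `data` using DEFLATE with a 2**wbits-byte window."""
    comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return len(comp.compress(data) + comp.flush())


def blocked_size(data: bytes, block_size: int, wbits: int = 15) -> int:
    """Total compressed size when every block is compressed independently."""
    return sum(
        deflate_size(data[start:start + block_size], wbits)
        for start in range(0, len(data), block_size)
    )


def make_data(chunk_len: int = 16 * 1024, copies: int = 8, seed: int = 42) -> bytes:
    """Random chunk repeated several times: the only redundancy is long-range."""
    rng = random.Random(seed)
    chunk = bytes(rng.getrandbits(8) for _ in range(chunk_len))
    return chunk * copies


if __name__ == "__main__":
    data = make_data()
    print("raw bytes:                  ", len(data))
    # Whole stream: a window that reaches back 16 KiB finds the repeats.
    print("whole, 32 KiB window:       ", deflate_size(data, wbits=15))
    # Whole stream, but the window is too short to see the previous copy.
    print("whole, 4 KiB window:        ", deflate_size(data, wbits=12))
    # Independent 4 KiB blocks: the window size no longer matters, because
    # nothing outside the current block is visible to the compressor.
    print("4 KiB blocks, 32 KiB window:", blocked_size(data, 4096, wbits=15))
    print("4 KiB blocks, 4 KiB window: ", blocked_size(data, 4096, wbits=12))
```

On this input the large window only pays off when the whole stream is compressed at once. Once the data is cut into independent small blocks, both window sizes give roughly the same total, which is the same mechanism that limits what a larger-window compressor can gain on block-compressed stored fields.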