[ https://issues.apache.org/jira/browse/LUCENE-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319623#comment-17319623 ]

Michael McCandless commented on LUCENE-9919:
--------------------------------------------

How about just building a custom {{Codec}} using a pure Java ZSTD implementation, instead of JDK DEFLATE, and then running our standard {{luceneutil}} benchmarks?

This is a great example use-case of why we built the {{Codec}} API in the first place – to enable fun experimentation exactly like this!

> ZSTD Compressor/Decompressor support in Lucene
> ----------------------------------------------
>
>                 Key: LUCENE-9919
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9919
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Praveen Nishchal
>            Priority: Major
>              Labels: compression, lucene, zstandard
>
> Lucene currently supports LZ4 and Zlib compression/decompression for
> StoredFieldsFormat, DocValuesFormat, TermVectorsFormat and PostingsFormat
> codecs. We propose Zstandard ([https://facebook.github.io/zstd/])
> compression/decompression for all codecs mentioned earlier for the following
> reasons:
>  * Zstandard is being used in some of the most popular open source projects
> like Apache Cassandra, Hadoop and Kafka.
>  * Zstandard, at the default setting of 3, is expected to show substantial
> improvements in both compression and decompression speed, while compressing
> at the same ratio as zlib, as per the study mentioned by Yann Collet at Facebook.
>  * Zstandard currently offers 22 different compression levels, which enable
> flexible, granular trade-offs between compression speed and ratios for future
> data. For example, we can use level 1 if speed is most important and level 22
> if size is most important.
>  * Zstandard is designed to scale with modern hardware.
>  * Small data
>  - It has APIs for dictionary compression as well. Small data
> compression can range anywhere from 2x to 5x better than compression without
> dictionaries.
>  * Zstandard is being continuously improved by Facebook/Community.
>
> Kindly go through the link below for more details:
> [https://engineering.fb.com/2016/08/31/core-data/smaller-and-faster-data-compression-with-zstandard/]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
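
The speed-versus-size trade-off across compression levels described in the issue can be explored on the JDK DEFLATE baseline mentioned in the comment before any ZSTD code exists. The sketch below uses only {{java.util.zip}}; it is not a ZSTD implementation and not a substitute for {{luceneutil}}. The class name {{DeflateLevelSketch}}, its helper methods and the sample data are hypothetical and chosen only for illustration. A ZSTD experiment would presumably swap the body of {{compress}}/{{decompress}} for calls into a pure Java ZSTD library, which is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateLevelSketch {

    // Hypothetical sample: repetitive text standing in for stored-field data.
    static byte[] sampleData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("doc ").append(i)
              .append(": lucene stored field value ").append(i % 17).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compress the whole input with JDK DEFLATE at the given level.
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[8192];
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    // Decompress into a buffer of the known original length.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] restored = new byte[originalLength];
            int off = 0;
            while (off < originalLength && !inflater.finished()) {
                int n = inflater.inflate(restored, off, originalLength - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unexpected input");
                }
                off += n;
            }
            if (off != originalLength) {
                throw new IllegalStateException("decompressed length mismatch");
            }
            return restored;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt DEFLATE stream", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] data = sampleData();
        int[] levels = {
            Deflater.BEST_SPEED, Deflater.DEFAULT_COMPRESSION, Deflater.BEST_COMPRESSION
        };
        for (int level : levels) {
            long start = System.nanoTime();
            byte[] compressed = compress(data, level);
            long micros = (System.nanoTime() - start) / 1_000;
            boolean ok = Arrays.equals(data, decompress(compressed, data.length));
            System.out.printf("level=%d in=%d out=%d time=%dus roundTrip=%b%n",
                    level, data.length, compressed.length, micros, ok);
        }
    }
}
```

A single timed call like this is only a rough indication; JIT warm-up and the synthetic data make the numbers unsuitable for conclusions, which is why the comment points at the standard benchmarks on a real {{Codec}}.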