[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338626#comment-17338626 ]
Jack Conradson commented on LUCENE-9843:
----------------------------------------

I have attached a patch ([^LUCENE-9843.patch]) that attempts to address this issue using the suggestions given by [~rcmuir]. The patch does the following:
* Removes the best_speed/best_compression mode for just doc values from the Lucene90Codec
* Terms dictionaries now always use compression unless the values are below the TERMS_DICT_BLOCK_COMPRESSION_THRESHOLD
* Binary fields now never use compression
* Consolidates many tests into TestLucene90DocValuesFormat, as there is no longer a need for separate tests for the different options
* Removes tests that relied on comparing best_speed and best_compression against each other

> Remove compression option on doc values
> ---------------------------------------
>
>                 Key: LUCENE-9843
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9843
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Blocker
>         Attachments: LUCENE-9843.patch
>
>
> Options on file formats add complexity and put a big tax on
> backward-compatibility testing. I'm the one who introduced it in LUCENE-9378, but
> I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary
> fields have so much redundancy that it's wasteful not to compress them at
> all. But unfortunately, this slowed down some search workloads and we decided
> to introduce this option as a way to let users choose the trade-off they want.