[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302154#comment-17302154 ]
Robert Muir commented on LUCENE-9843:
-------------------------------------

+1

There is a more obvious one to fix immediately: {{SORTED}}. Why is the codec option available on the {{SORTED}} terms dictionary? The option is not necessary: it does not impact the speed of per-document ordinals. And the term dictionary (for lookupOrd) is block-compressed, prefix coded, etc., regardless of what you supply. So let's please remove the option there.

For {{BINARY}}, I personally think it is wrong to compress by default, in the default codec. The user wants a per-document byte[] (with their custom encoding), we should make it fast and just plumb it through. It's like a catch-all type when no other type (numeric, string, etc.) is truly suitable. Sure, maybe some users are putting "yuge" stuff in there, where compression might not hurt their speed and save some disk: we could supply a different codec in the {{codecs/}} package for such users. But I don't think it makes sense at all to support it in the default codec with backwards compatibility.

> Remove compression option on doc values
> ---------------------------------------
>
>                 Key: LUCENE-9843
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9843
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Options on file formats add complexity and put a big tax on
> backward-compatibility testing. I'm the one who introduced it in LUCENE-9378 but
> I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary
> fields have so much redundancy that it's wasteful not to compress them at
> all. But unfortunately, this slowed down some search workloads and we decided
> to introduce this option as a way to let users choose the trade-off they want.
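
To make the trade-off in the issue description concrete (highly redundant binary values shrink a lot when compressed, but every read then pays for decompression), here is a minimal Java sketch. It does not use Lucene at all: the class name {{BinaryValueCompressionSketch}}, its {{compress}}/{{decompress}} helpers and the sample values are hypothetical, and {{java.util.zip.Deflater}} only stands in for whatever compression a codec might apply.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustration only: hypothetical helpers, not part of any Lucene API. */
public class BinaryValueCompressionSketch {

    /** Compresses one per-document byte[] value with DEFLATE (stand-in for a codec's compression). */
    public static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the original value; this is the extra work every read pays when values are compressed. */
    public static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] raw = new byte[originalLength];
        int offset = 0;
        try {
            while (offset < originalLength && !inflater.finished()) {
                int n = inflater.inflate(raw, offset, originalLength - offset);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unexpected input: stop instead of looping forever
                }
                offset += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed value", e);
        } finally {
            inflater.end();
        }
        return raw;
    }

    public static void main(String[] args) {
        // A made-up, highly redundant value: the case where skipping compression wastes disk.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("status=active;region=eu;");
        }
        byte[] redundant = sb.toString().getBytes(StandardCharsets.UTF_8);

        // A made-up short value with a custom encoding: little for a compressor to exploit.
        byte[] small = {0x01, 0x7f, 0x22, 0x5a, 0x10, 0x3c, 0x44, 0x09};

        byte[] packedRedundant = compress(redundant);
        byte[] packedSmall = compress(small);

        System.out.println("redundant: raw=" + redundant.length + " compressed=" + packedRedundant.length);
        System.out.println("small:     raw=" + small.length + " compressed=" + packedSmall.length);

        // Reading a compressed value back requires a decompression step on each access.
        System.out.println("round trip ok: "
                + Arrays.equals(redundant, decompress(packedRedundant, redundant.length)));
    }
}
```

Running {{main}} prints the raw and compressed sizes for both values. The redundant value should shrink substantially, while the short value gains little or nothing, which is roughly the tension between the two positions in this thread.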
--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org