[ 
https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302154#comment-17302154
 ] 

Robert Muir commented on LUCENE-9843:
-------------------------------------

+1

There is a more obvious one to fix immediately: {{SORTED}}. Why is the codec 
option available on the {{SORTED}} terms dictionary? The option is not necessary: 
it does not impact the speed of per-document ordinals. And the terms dictionary 
(for lookupOrd) is block-compressed, prefix-coded, etc. regardless of what you 
supply. So let's please remove the option there.
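
To illustrate why prefix coding alone already shrinks a sorted terms dictionary, here is a minimal, self-contained sketch. It is not Lucene's implementation: the class {{PrefixCodingSketch}}, its {{Entry}} type, and the {{encode}} helper are hypothetical names, and the {{lookupOrd}} method here only borrows the name of the real call. The sketch also assumes a single run of terms decoded from the start, whereas a block-based format would restart at block boundaries to bound the decoding work.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of prefix coding over sorted terms; not Lucene code.
final class PrefixCodingSketch {

    // One encoded term: the number of leading bytes shared with the
    // previous term, plus the bytes that differ.
    static final class Entry {
        final int prefixLength;
        final byte[] suffix;

        Entry(int prefixLength, byte[] suffix) {
            this.prefixLength = prefixLength;
            this.suffix = suffix;
        }
    }

    // Encodes terms that are assumed to be already sorted.
    static List<Entry> encode(List<String> sortedTerms) {
        List<Entry> entries = new ArrayList<>();
        byte[] previous = new byte[0];
        for (String term : sortedTerms) {
            byte[] current = term.getBytes(StandardCharsets.UTF_8);
            int limit = Math.min(previous.length, current.length);
            int shared = 0;
            while (shared < limit && previous[shared] == current[shared]) {
                shared++;
            }
            entries.add(new Entry(shared, Arrays.copyOfRange(current, shared, current.length)));
            previous = current;
        }
        return entries;
    }

    // Rebuilds the term for an ordinal by replaying entries from the start.
    static String lookupOrd(List<Entry> entries, int ord) {
        byte[] term = new byte[0];
        for (int i = 0; i <= ord; i++) {
            Entry entry = entries.get(i);
            byte[] next = new byte[entry.prefixLength + entry.suffix.length];
            System.arraycopy(term, 0, next, 0, entry.prefixLength);
            System.arraycopy(entry.suffix, 0, next, entry.prefixLength, entry.suffix.length);
            term = next;
        }
        return new String(term, StandardCharsets.UTF_8);
    }
}
```

In this sketch, neighbouring sorted terms such as "apple" and "applet" share most of their bytes, so only a short suffix is stored per term. That saving comes from the sort order itself and does not depend on any user-supplied compression setting.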

For {{BINARY}}, I personally think it is wrong to compress by default in 
the default codec. The user wants a per-document byte[] (with their custom 
encoding); we should make it fast and just plumb it through. It's like a 
catch-all type for when no other type (numeric, string, etc.) is truly suitable. 
Sure, maybe some users are putting "yuge" stuff in there, where compression 
might not hurt their speed and would save some disk: we could supply a different 
codec in the {{codecs/}} package for such users. But I don't think it makes 
sense at all to support it in the default codec with backwards compatibility.


> Remove compression option on doc values
> ---------------------------------------
>
>                 Key: LUCENE-9843
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9843
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Options on file formats add complexity and put a big tax on 
> backward-compatibility testing. I'm the one who introduced it in LUCENE-9378, but 
> I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary 
> fields have so much redundancy that it's wasteful not to compress them at 
> all. But unfortunately, this slowed down some search workloads and we decided 
> to introduce this option as a way to let users choose the trade-off they want.



