[ 
https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339052#comment-17339052
 ] 

Adrien Grand commented on LUCENE-9843:
--------------------------------------

The patch looks good. This makes me wonder whether we should remove the 
threshold that only enables compression on the terms dict for non-tiny 
dictionaries: I believe that it hurts test coverage since our tests rarely 
index many documents, yet I'm not sure whether it brings real benefits to our 
users: iterating the terms dict is going to be super fast anyway if you only 
have a few terms?

> Remove compression option on doc values
> ---------------------------------------
>
>                 Key: LUCENE-9843
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9843
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Blocker
>         Attachments: LUCENE-9843.patch
>
>
> Options on file formats add complexity and put a big tax on 
> backward-compatibility testing. I'm the one who introduced it in LUCENE-9378 but 
> I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary 
> fields have so much redundancy that it's wasteful not to compress them at 
> all. But unfortunately, this slowed down some search workloads and we decided 
> to introduce this option as a way to let users choose the trade-off they want.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

