[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321327#comment-17321327 ]
Robert Muir commented on LUCENE-9843:
-------------------------------------

I moved this issue to a blocker for 9.0 because I've already seen multiple instances where these compression settings are set inappropriately, and from a back-compat perspective we need to stop the bleeding before we have to support all these variants for a long time.

I'll summarize my proposal above again:
* Remove the option for SORTED term dictionaries and just compress always. This does not impact the speed of per-doc ordinals.
* Remove the option for BINARY and don't compress. It is a catch-all and we don't know the use-case. Supply a different codec if someone wants to do block compression over binary, but avoid the back-compat hassle.

It seems the issue could easily be split into two tasks.

> Remove compression option on doc values
> ---------------------------------------
>
>                 Key: LUCENE-9843
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9843
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Blocker
>
> Options on file formats add complexity and put a big tax on
> backward-compatibility testing. I'm the one who introduced it in LUCENE-9378 but
> I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary
> fields have so much redundancy that it's wasteful not to compress them at
> all. But unfortunately, this slowed down some search workloads and we decided
> to introduce this option as a way to let users choose the trade-off they want.