jpountz commented on pull request #1543:
URL: https://github.com/apache/lucene-solr/pull/1543#issuecomment-638908545
Hi @msokolov, I'm looking into improving the encoding of lengths, which is the next bottleneck for binary doc values.

We are using binary doc values to run regex queries on a high-cardinality field. An ngram index helps find good candidates, and binary doc values are then used for verification. Field values are typically files, URLs, etc., which can have significant redundancy.

I'm OK with making compression less aggressive, though I think it would be a pity to disable it entirely and never take advantage of redundancy. You mentioned slowdowns, but this actually depends on the query, e.g. I'm seeing an almost 2x speedup when sorting a `MatchAllDocsQuery` on `wikimedium10m`. Let me take a shot at making compression less aggressive/slow?
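For context on the candidate-then-verify pattern mentioned above, here is a minimal sketch of what the verification side can look like against the public Lucene API. This is not code from this PR: the field name, candidate query, and helper method are hypothetical, and a real implementation would likely live inside a `Query`/`TwoPhaseIterator` rather than a loop over leaves.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.Scorer;
import org.apache.lucene.search.Weight;
import org.apache.lucene.util.BytesRef;

public class CandidateVerifyExample {

  /**
   * Collect docs whose binary doc value matches the regex, restricted to the
   * candidates produced by an ngram query. Hypothetical helper for illustration.
   */
  static List<Integer> verifyCandidates(IndexSearcher searcher, Query ngramCandidates,
                                        String dvField, Pattern regex) throws IOException {
    List<Integer> hits = new ArrayList<>();
    Weight weight = searcher.createWeight(
        searcher.rewrite(ngramCandidates), ScoreMode.COMPLETE_NO_SCORES, 1f);
    for (LeafReaderContext ctx : searcher.getIndexReader().leaves()) {
      Scorer scorer = weight.scorer(ctx);
      if (scorer == null) {
        continue; // no candidates in this segment
      }
      BinaryDocValues dv = DocValues.getBinary(ctx.reader(), dvField);
      DocIdSetIterator candidates = scorer.iterator();
      for (int doc = candidates.nextDoc();
           doc != DocIdSetIterator.NO_MORE_DOCS;
           doc = candidates.nextDoc()) {
        // Reading the value is where the doc-values compression scheme shows up
        // in query latency: each lookup has to locate and decode the stored bytes.
        if (dv.advanceExact(doc)) {
          BytesRef value = dv.binaryValue();
          if (regex.matcher(value.utf8ToString()).find()) {
            hits.add(ctx.docBase + doc);
          }
        }
      }
    }
    return hits;
  }
}
```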