jpountz commented on pull request #1543:
URL: https://github.com/apache/lucene-solr/pull/1543#issuecomment-638908545


   Hi @msokolov I'm looking into improving the encoding of lengths, which is 
the next bottleneck for binary doc values. We are using binary doc values to 
run regex queries on a high-cardinality field. An n-gram index helps find good 
candidates, and binary doc values are then used for verification. Field values 
are typically files, URLs, ... which can have significant redundancy. I'm ok 
with making compression less aggressive, though I think it would be a pity 
to disable it entirely and never take advantage of redundancy. You mentioned 
slowdowns, but this actually depends on the query, e.g. I'm seeing an almost 2x 
speedup when sorting a `MatchAllDocsQuery` on `wikimedium10m`. Let me give it a 
try and make compression less aggressive/slow?
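
   For context, a minimal sketch of the candidate/verification pattern described above (not the actual code from our use case): an n-gram query cheaply narrows down candidate docs, then `BinaryDocValues` supply the full value for exact regex verification. The field name `"path"`, the candidate query, and the use of `java.util.regex` here are illustrative assumptions.

   ```java
   import java.io.IOException;
   import java.util.regex.Pattern;

   import org.apache.lucene.index.BinaryDocValues;
   import org.apache.lucene.index.DocValues;
   import org.apache.lucene.index.LeafReaderContext;
   import org.apache.lucene.search.DocIdSetIterator;
   import org.apache.lucene.search.IndexSearcher;
   import org.apache.lucene.search.Query;
   import org.apache.lucene.search.ScoreMode;
   import org.apache.lucene.search.Scorer;
   import org.apache.lucene.search.Weight;

   public class NGramThenVerify {

     /** Count docs whose "path" binary doc value matches the regex, using an
      *  n-gram candidate query to avoid decoding values for the whole index. */
     static long countMatches(IndexSearcher searcher, Query ngramCandidates, Pattern regex)
         throws IOException {
       long count = 0;
       Weight weight = searcher.createWeight(
           searcher.rewrite(ngramCandidates), ScoreMode.COMPLETE_NO_SCORES, 1f);
       for (LeafReaderContext ctx : searcher.getIndexReader().leaves()) {
         Scorer scorer = weight.scorer(ctx);
         if (scorer == null) {
           continue; // no candidates in this segment
         }
         BinaryDocValues values = DocValues.getBinary(ctx.reader(), "path");
         DocIdSetIterator it = scorer.iterator();
         for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
           // Only candidate docs pay the cost of decoding the binary value,
           // which is where length encoding and block decompression matter.
           if (values.advanceExact(doc)
               && regex.matcher(values.binaryValue().utf8ToString()).matches()) {
             count++; // verified: the stored value really matches the regex
           }
         }
       }
       return count;
     }
   }
   ```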

