gandhi-viral commented on pull request #1543:
URL: https://github.com/apache/lucene-solr/pull/1543#issuecomment-642889243


   Unfortunately, red-line QPS (throughput) in our internal benchmarking is 
still suffering (-49%) with the latest PR.
   
   We were able to isolate one particular field, a metadata field averaging 
~90 bytes per value, which is causing most of our regression. After disabling 
compression on that field, we are at -8% red-line QPS compared to using 
Lucene 8.4 BinaryDocValues (BDVs). Looking further into the access pattern for 
that field, we see that num_access / num_blocks_decompressed = 1.51, so we are 
decompressing a whole block for roughly every 1.5 hits.
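
   To make the ratio above concrete, here is a minimal sketch of how such a 
number could be estimated from a list of accessed doc IDs. `BlockAccessRatio` 
is a hypothetical helper, not part of Lucene. It assumes doc IDs are visited 
in increasing order, that only the most recently decompressed block is kept 
around, and that the block size is passed in by the caller.

```java
import java.util.List;

/** Hypothetical helper (not a Lucene class): accesses per decompressed block. */
final class BlockAccessRatio {
    private BlockAccessRatio() {}

    /**
     * @param sortedDocIds doc IDs read for one field, in increasing order
     * @param docsPerBlock assumed number of documents per compressed block
     * @return number of accesses divided by number of block decompressions
     */
    static double accessesPerDecompressedBlock(List<Integer> sortedDocIds, int docsPerBlock) {
        if (docsPerBlock <= 0) {
            throw new IllegalArgumentException("docsPerBlock must be positive");
        }
        if (sortedDocIds.isEmpty()) {
            return 0.0;
        }
        long lastBlock = -1;   // assumption: only the last decompressed block is cached
        long decompressed = 0;
        for (int docId : sortedDocIds) {
            long block = docId / docsPerBlock;
            if (block != lastBlock) {
                decompressed++;        // a new block has to be decompressed
                lastBlock = block;
            }
        }
        return (double) sortedDocIds.size() / decompressed;
    }
}
```

   For example, with an assumed 32 docs per block, hits on doc IDs 3, 5, 40, 
100, 101 and 300 touch blocks 0, 0, 1, 3, 3 and 9, so 6 accesses cause 4 
decompressions, a ratio of 1.5.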
   
   By temporarily setting `BINARY_LENGTH_COMPRESSION_THRESHOLD = 10000` to 
effectively disable the LZ4 compression, we are at -2% red-line QPS, which we 
could live with. Could we add an option to the `Lucene80DocValuesConsumer` 
constructor to disable compression for BinaryDocValues, or to control the 
32-byte threshold? We could keep this compression enabled by default, since 
it's clearly helpful in many cases according to the `luceneutil` benchmarks, 
but let expert users create a custom Codec to control it.
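
   As a rough illustration of the kind of option requested above, the sketch 
below passes the threshold in through a constructor. `BinaryCompressionPolicy` 
and all of its members are hypothetical names, not Lucene's API, and the rule 
that a length at or above the threshold triggers compression is an assumption 
made only for this example.

```java
/** Hypothetical sketch (not Lucene's API): a constructor-supplied compression threshold. */
final class BinaryCompressionPolicy {
    /** Default mirrors the 32-byte threshold mentioned in this comment. */
    static final int DEFAULT_THRESHOLD_BYTES = 32;

    private final int thresholdBytes;

    /** Keeps compression on by default. */
    BinaryCompressionPolicy() {
        this(DEFAULT_THRESHOLD_BYTES);
    }

    /** Lets an expert user choose the threshold, e.g. from a custom codec. */
    BinaryCompressionPolicy(int thresholdBytes) {
        if (thresholdBytes < 0) {
            throw new IllegalArgumentException("thresholdBytes must be >= 0");
        }
        this.thresholdBytes = thresholdBytes;
    }

    /** A threshold no realistic length reaches, which effectively disables compression. */
    static BinaryCompressionPolicy disabled() {
        return new BinaryCompressionPolicy(Integer.MAX_VALUE);
    }

    /** Assumption: compress when the length is at least the threshold. */
    boolean shouldCompress(int lengthBytes) {
        return lengthBytes >= thresholdBytes;
    }
}
```

   Under these assumptions, a ~90 byte value is compressed with the default 
policy but left uncompressed with `new BinaryCompressionPolicy(10000)` or 
`BinaryCompressionPolicy.disabled()`.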
   
   Thank you @jpountz for your help. 




