jpountz commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2724038977

   I have some small concerns:
    - The 512 step is tied to the number of points per leaf. This is not a 
big deal, since postings are similar: their encoding logic is specialized for 
blocks of 128. Still, I would rather err on the side of a block size smaller 
than 512, which feels largish.
    - Complexity: the encoding has 3 different sub-encodings: 512, 128 and 
remainder. Could we have only two?
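
   To illustrate the complexity point, here is a rough sketch of how a count 
of values would fall into three tiers versus two. It assumes a greedy split 
(largest block size first); `BlockSplit` and its methods are hypothetical 
names for illustration, not code from the PR.

```java
public class BlockSplit {
    /**
     * Greedy three-tier split (assumed, not taken from the PR):
     * returns {number of 512-blocks, number of 128-blocks, remainder}.
     */
    static int[] splitThreeTier(int count) {
        int rest = count % 512;
        return new int[] {count / 512, rest / 128, rest % 128};
    }

    /** Two-tier split: returns {number of full blocks, remainder}. */
    static int[] splitTwoTier(int count, int blockSize) {
        return new int[] {count / blockSize, count % blockSize};
    }

    public static void main(String[] args) {
        int[] three = splitThreeTier(700);
        int[] two = splitTwoTier(700, 128);
        System.out.println("three-tier: " + three[0] + " x 512, " + three[1] + " x 128, " + three[2] + " left");
        System.out.println("two-tier:   " + two[0] + " x 128, " + two[1] + " left");
    }
}
```

   With two tiers, the decoder only needs one specialized block path plus a 
generic remainder path.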
   
   But my main concern is that I would like to better understand why 512 
performs so much better. Something must happen with this 512 step that 
doesn't happen otherwise, such as different instructions, loop unrolling, 
better CPU pipelining or something else. I am uncomfortable merging something 
that is faster without having at least an intuition of why it's faster, so 
that I can also understand which JVMs and CPUs would enable this speedup. 
Could pipelining be the reason, as 24 (bits per value) * 32 (step) 
< 2 * 512 (bit width of SIMD instructions)? But then something like 128 should 
perform well, while your benchmark suggests it's still much worse than 512?
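
   To make the arithmetic in that question concrete, the sketch below computes 
how many bits one decoding step covers at 24 bits per value, and how many 
512-bit vectors that spans. The class and method names are hypothetical, and 
the 512-bit vector width is the assumption stated above, not a measured 
property of any particular JVM or CPU.

```java
public class StepArithmetic {
    // Assumed SIMD register width in bits (e.g. AVX-512), as in the comment above.
    static final int SIMD_BITS = 512;

    /** Total payload bits consumed by one decoding step. */
    static int bitsPerStep(int bitsPerValue, int step) {
        return bitsPerValue * step;
    }

    /** Number of SIMD_BITS-wide vectors needed to hold one step, rounded up. */
    static int vectorsPerStep(int bitsPerValue, int step) {
        return (bitsPerStep(bitsPerValue, step) + SIMD_BITS - 1) / SIMD_BITS;
    }

    public static void main(String[] args) {
        for (int step : new int[] {32, 128, 512}) {
            System.out.println("step=" + step
                + " bits=" + bitsPerStep(24, step)
                + " vectors=" + vectorsPerStep(24, step));
        }
    }
}
```

   A step of 32 covers 768 bits, which fits in two 512-bit vectors; steps of 
128 and 512 cover 3072 and 12288 bits (6 and 24 vectors). This only restates 
the sizes involved; it does not by itself explain the benchmark gap between 
128 and 512.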
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

