ChrisHegarty commented on PR #12417:
URL: https://github.com/apache/lucene/pull/12417#issuecomment-1628920916

> Comparing them to the baseline, it shows that neither of them performs better than the baseline. This is a little surprising, and also disappointing. I had assumed, without verifying, that there was significant overhead in the bit packing. I'll have to spend some more time on the benchmark results.

> I found that the bottleneck in performance is not in the encoding and decoding, but rather in the prefix sum calculation.

Hmm.. @jpountz and I have been looking a little at this, see [vectorized-prefix-sum][1]. But there are issues there with how the Vector API handles cross-lane moves - they are not optimal. I've yet to get this feedback to OpenJDK.

I think we should step back a little here and get a better handle on exactly where the most significant bottleneck is. @tang-hi, you say it is in the prefix sum. How are you determining that?

Some background on the current implementation: [LUCENE-9027: Use SIMD instructions to decode postings](https://issues.apache.org/jira/browse/LUCENE-9027)

[1]: https://github.com/jpountz/vectorized-prefix-sum
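For readers following along, here is a minimal sketch of what the prefix sum step refers to: turning delta-encoded doc IDs back into absolute IDs. This is not Lucene's actual decode path; the method names and the choice of `SPECIES_256` are illustrative assumptions. The second method shows a Hillis-Steele style reformulation with the incubating Vector API, whose `unslice` calls are the kind of cross-lane moves discussed above:

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorSpecies;

// Requires --add-modules jdk.incubator.vector to compile and run.
public class PrefixSumSketch {

  // Scalar prefix sum: each output depends on the previous one through `acc`,
  // so the loop cannot be auto-vectorized without a cross-lane reformulation.
  static void scalarPrefixSum(int[] deltas, int[] docIds, int base) {
    int acc = base;
    for (int i = 0; i < deltas.length; i++) {
      acc += deltas[i];
      docIds[i] = acc;
    }
  }

  // Hypothetical Hillis-Steele in-register prefix sum. Assumes 8 int lanes
  // (256-bit species); the unslice() shifts are cross-lane moves.
  static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_256;

  static void vectorPrefixSum(int[] deltas, int[] docIds, int base) {
    int acc = base;
    int i = 0;
    int upper = SPECIES.loopBound(deltas.length);
    for (; i < upper; i += SPECIES.length()) {
      IntVector v = IntVector.fromArray(SPECIES, deltas, i);
      // log2(8) = 3 shift-and-add steps: afterwards lane k holds
      // deltas[i] + ... + deltas[i + k] (an inclusive prefix sum).
      v = v.add(v.unslice(1));
      v = v.add(v.unslice(2));
      v = v.add(v.unslice(4));
      v = v.add(acc);                      // carry in the running total
      v.intoArray(docIds, i);
      acc = v.lane(SPECIES.length() - 1);  // carry out for the next block
    }
    for (; i < deltas.length; i++) {       // scalar tail
      acc += deltas[i];
      docIds[i] = acc;
    }
  }
}
```

The vectorized form trades the loop-carried dependency for log2(lanes) cross-lane shift-and-add steps per block; whether that wins depends on how cheaply the JIT lowers those cross-lane operations, which is the concern raised above.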
> Comparing them to the baseline, it shows that neither of them performs better than the baseline. This is a little surprising, and also disappointing. I had assumed, without verifying, that there was significant overhead in the bit packing. I'll have to spend some more time on the benchmark results. > I found that the bottleneck in performance is not in the encoding and decoding, but rather in the prefix sum calculation. Hmm.. @jpountz and I have been looking a little at this, see [vectorized-prefix-sum][1]. But there are issues there, with how the Vector API handles cross-lane moves - they are not optimal. I've yet to get this feedback to OpenJDK. I think we should step back a little here, and get a better handle on exactly where the most significant bottleneck is. @tang-hi you say it is in prefix sum. How are you determining that? Some background on the current implementation: [LUCENE-9027: Use SIMD instructions to decode postings](https://issues.apache.org/jira/browse/LUCENE-9027) [1]: https://github.com/jpountz/vectorized-prefix-sum -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org