ChrisHegarty commented on issue #12396:
URL: https://github.com/apache/lucene/issues/12396#issuecomment-1620696514

   I would like to pause a little, double check where we're going, and reset if 
needed. It's become clear to me that we're not quite aligned.
   
   The main issue I see is with the interface. All the code I've seen so far 
uses `long[]`, and consequently `LongVector` in the Panama implementation. 
After chatting a little with @jpountz offline, and re-reading previous 
comments, it's clear that this is not what we want.
   
   What we want is `int[]` and `IntVector`. Using longs in the new 
implementation is wasteful, in terms of memory and (more importantly) vector 
lanes: a 128-bit vector holds four ints but only two longs. My initial 
experiments with ints are showing even bigger speedups, approx 4x (but I've not 
implemented it all yet).
   
   If we're in agreement with the above, then I'd like to propose that we go 
back to basics: implement all the various bitsPerValue variants for the 
Int128Vector species. This will effectively be a port of 
https://github.com/lemire/simdcomp/blob/master/src/simdbitpacking.c. We can 
then compare these to the existing ForUtil implementations and tweak the actual 
code as necessary. I've started this in this repo: 
https://github.com/ChrisHegarty/bitpacking (a pure port for now; we can analyse 
the benchmarks and tweak the code once it is complete).
   
   Clearly, there is a lot more to do: a scalar implementation of the new 
format (with `int[]`) will be required, and we'll need to benchmark it too; the 
calling code, including PForUtil, etc., will need to be updated to use `int[]`; 
we'll need to provide 256 and 512 species variants, etc. But that should be 
relatively straightforward. Let's put it aside until we get the full 1-32 
encode/decode implemented for `int[]` and `Int128Vector` first (which I'll 
continue to do in [bitpacking][1]).
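
   To make the scalar `int[]` side of this more concrete, here is a rough 
sketch of what a scalar encode/decode for bitsPerValue 1-32 could look like. 
It assumes a lane-interleaved layout for blocks of 128 ints, where value `i` 
goes to lane `i % 4` and each lane is packed into consecutive 32-bit words; 
whether the final format matches this layout exactly is an assumption. The 
names `ScalarLanePacking`, `pack` and `unpack` are hypothetical and do not 
come from Lucene or the bitpacking repo.

```java
import java.util.Arrays;

/** Hypothetical scalar sketch of a 4-lane bit-packing layout for blocks of 128 ints. */
final class ScalarLanePacking {
    static final int LANES = 4;        // a 128-bit vector holds four 32-bit lanes
    static final int BLOCK_SIZE = 128; // values per block, i.e. 32 values per lane

    /** Packs 128 values into 4 * bitsPerValue ints; values are masked to bitsPerValue bits. */
    static int[] pack(int[] values, int bitsPerValue) {
        check(values.length, BLOCK_SIZE, bitsPerValue);
        int[] packed = new int[LANES * bitsPerValue];
        int mask = bitsPerValue == 32 ? -1 : (1 << bitsPerValue) - 1;
        for (int i = 0; i < BLOCK_SIZE; i++) {
            int lane = i % LANES;
            int bitPos = (i / LANES) * bitsPerValue; // bit offset within this lane's stream
            int word = bitPos >>> 5;                 // 32-bit word index within the lane
            int shift = bitPos & 31;                 // bit offset within that word
            int v = values[i] & mask;
            packed[word * LANES + lane] |= v << shift;
            if (shift + bitsPerValue > 32) {
                // The value straddles a word boundary: spill the high bits into the next word.
                packed[(word + 1) * LANES + lane] |= v >>> (32 - shift);
            }
        }
        return packed;
    }

    /** Reverses pack: reads 4 * bitsPerValue ints and returns the 128 original values. */
    static int[] unpack(int[] packed, int bitsPerValue) {
        check(packed.length, LANES * bitsPerValue, bitsPerValue);
        int[] values = new int[BLOCK_SIZE];
        int mask = bitsPerValue == 32 ? -1 : (1 << bitsPerValue) - 1;
        for (int i = 0; i < BLOCK_SIZE; i++) {
            int lane = i % LANES;
            int bitPos = (i / LANES) * bitsPerValue;
            int word = bitPos >>> 5;
            int shift = bitPos & 31;
            int v = packed[word * LANES + lane] >>> shift;
            if (shift + bitsPerValue > 32) {
                // Pull the high bits back from the next word of the same lane.
                v |= packed[(word + 1) * LANES + lane] << (32 - shift);
            }
            values[i] = v & mask;
        }
        return values;
    }

    private static void check(int length, int expectedLength, int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 32]");
        }
        if (length != expectedLength) {
            throw new IllegalArgumentException("expected length " + expectedLength);
        }
    }

    public static void main(String[] args) {
        int[] values = new int[BLOCK_SIZE];
        for (int i = 0; i < BLOCK_SIZE; i++) {
            values[i] = i % 32; // every value fits in 5 bits
        }
        int[] packed = pack(values, 5);
        System.out.println(packed.length + " packed ints, round trip ok: "
            + Arrays.equals(values, unpack(packed, 5)));
    }
}
```

   In this sketch each group of four consecutive output ints corresponds to 
one 128-bit vector store, so a vectorized variant could in principle produce 
the same bytes. The per-bitsPerValue unrolled methods of a real port would 
replace the generic loop here.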
   
   Have I missed something? What do you think? 
   
   [1]: https://github.com/ChrisHegarty/bitpacking


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

