ChrisHegarty commented on issue #12396: URL: https://github.com/apache/lucene/issues/12396#issuecomment-1620696514
I would like to pause a little, double check where we're going, and reset if needed. It's become clear to me that we're not quite aligned.

The main issue I see is with the interface. All the code I've seen so far uses `long[]`, and consequently `LongVector` in the Panama implementation. After chatting a little with @jpountz offline, and re-reading previous comments, it's clear that this is not what we want. What we want is `int[]` and `IntVector`. Using longs in the new implementation is wasteful, in terms of both memory and (more importantly) vector lanes: a vector of a given width holds half as many longs as ints. My initial experiments with ints are showing even bigger speedups, approx 4x (but I've not implemented it all yet).

If we're in agreement with the above, then I'd like to propose that we go back to basics: implement all the various bitsPerValue variants for Int128Vector species. This will effectively be a port of https://github.com/lemire/simdcomp/blob/master/src/simdbitpacking.c. We can then compare these to the existing ForUtil implementations and tweak the actual code as necessary. I've started this in this repo https://github.com/ChrisHegarty/bitpacking (a pure port for now; we can analyse the benchmarks and tweak the code once complete).

Clearly, there is a lot more to do: a scalar implementation of the new format (with `int[]`) will be required, and we'll need to benchmark it too; the calling code, including PForUtil etc., will need to be updated to use `int[]`; we'll need to provide 256 and 512 species variants; and so on. But that should be relatively straightforward, so let's put it aside until we get the full 1-32 encode/decode implemented for `int[]` and `Int128Vector` first (which I'll continue to do in [bitpacking][1]).

Have I missed something? What do you think?

[1]: https://github.com/ChrisHegarty/bitpacking
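To make the `int[]` layout discussed above concrete, here is a minimal scalar sketch of a simdcomp-style block packer. It is only an illustration under stated assumptions: the class and method names (`BitPackSketch`, `pack`, `unpack`) are hypothetical and are not taken from Lucene or from the bitpacking repo, and the block of 128 values interleaved across 4 lanes is assumed from a 128-bit vector holding four 32-bit ints. In a Panama version, the loop over lanes would presumably collapse into single `IntVector` operations.

```java
// Hypothetical scalar sketch; names and layout are assumptions, not Lucene code.
final class BitPackSketch {
    static final int BLOCK = 128; // values per block (assumed)
    static final int LANES = 4;   // 32-bit lanes in a 128-bit vector

    // Packs 128 values of `bits` bits each into 4 * bits ints.
    // Lane k packs in[k], in[k + 4], in[k + 8], ... low bits first.
    static int[] pack(int[] in, int bits) {
        if (in.length != BLOCK || bits < 1 || bits > 32) {
            throw new IllegalArgumentException("need 128 values and 1..32 bits");
        }
        int[] out = new int[LANES * bits];
        int mask = bits == 32 ? -1 : (1 << bits) - 1;
        for (int lane = 0; lane < LANES; lane++) {
            int word = 0;  // output word index within this lane
            int shift = 0; // bit offset of the next value in that word
            for (int i = 0; i < BLOCK / LANES; i++) {
                int v = in[i * LANES + lane] & mask;
                out[word * LANES + lane] |= v << shift;
                shift += bits;
                if (shift >= 32) {
                    word++;
                    shift -= 32;
                    if (shift > 0) {
                        // The value straddled a word boundary: spill its high bits.
                        out[word * LANES + lane] |= v >>> (bits - shift);
                    }
                }
            }
        }
        return out;
    }

    // Inverse of pack: reads 4 * bits ints and returns the 128 original values.
    static int[] unpack(int[] packed, int bits) {
        if (bits < 1 || bits > 32 || packed.length != LANES * bits) {
            throw new IllegalArgumentException("need 4 * bits ints and 1..32 bits");
        }
        int[] out = new int[BLOCK];
        int mask = bits == 32 ? -1 : (1 << bits) - 1;
        for (int lane = 0; lane < LANES; lane++) {
            int word = 0;
            int shift = 0;
            for (int i = 0; i < BLOCK / LANES; i++) {
                int v = packed[word * LANES + lane] >>> shift;
                shift += bits;
                if (shift >= 32) {
                    word++;
                    shift -= 32;
                    if (shift > 0) {
                        // Pull the remaining high bits from the next word.
                        v |= packed[word * LANES + lane] << (bits - shift);
                    }
                }
                out[i * LANES + lane] = v & mask;
            }
        }
        return out;
    }
}
```

With this layout, 128 values at 5 bits per value occupy 20 ints (80 bytes) instead of 512 bytes unpacked. The same block held as `long[]` would fit only two values per 128-bit vector rather than four, which is the lane waste described above.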
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org