mikemccand commented on a change in pull request #973: LUCENE-9027: Use SIMD instructions to decode postings.
URL: https://github.com/apache/lucene-solr/pull/973#discussion_r347145282
##########
File path: lucene/core/src/java/org/apache/lucene/store/ByteBufferIndexInput.java
##########
@@ -107,6 +122,24 @@ public final void readBytes(byte[] b, int offset, int len) throws IOException {
     }
   }
 
+  @Override
+  public void readLongs(ByteOrder byteOrder, long[] dst, int offset, int length) throws IOException {
+    try {
+      final int position = curBuf.position();
+      guard.getLongs(curLongBufferViews[position & 0x07].position(position >>> 3), dst, offset, length);

Review comment:
OK I must be a little confused here: the loop above under the lazy init `if` (`for (int i = 0; i < Math.min(Long.BYTES, curBuf.limit()); ++i) {`) seems to initialize up to `64` clones, yet the array access here (`position & 0x07`) seems to use only the first `8`.

And I thought the padding would only be needed at the start of postings lists, and only for terms whose frequency is `>= 128`, not at the start of each block. But you're right: since each block has one extra leading byte (to tell us `numBits`), pad bytes would indeed be needed for each block, hrmph. Never mind ;)

I'm also surprised this trick actually helps, since under the hood the CPU must still do unaligned long decodes (which I think modern x86-64 CPUs are good at)? Maybe add a comment about why this trick is worthwhile?
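For reference, the indexing in the `readLongs` hunk (`position & 0x07` picks a view, `position >>> 3` picks a long within that view) can be sketched in isolation. This is only a minimal illustration, not the PR's implementation: the class `LongViewSketch` and its methods `buildViews` and `readLongs` are hypothetical names, and the `guard` wrapper from `ByteBufferIndexInput` is left out. Note that `Long.BYTES` is 8 in Java (`Long.SIZE` is 64), so a loop bounded by `Math.min(Long.BYTES, limit)` creates at most 8 views, one per byte alignment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

// Hypothetical sketch: one LongBuffer view per byte alignment of a ByteBuffer.
final class LongViewSketch {

  // View i starts at byte offset i, so its k-th long covers bytes [i + 8k, i + 8k + 8).
  static LongBuffer[] buildViews(ByteBuffer buf, ByteOrder order) {
    int n = Math.min(Long.BYTES, buf.limit()); // Long.BYTES == 8, so at most 8 views
    LongBuffer[] views = new LongBuffer[n];
    for (int i = 0; i < n; i++) {
      ByteBuffer dup = buf.duplicate().order(order);
      dup.position(i);
      views[i] = dup.asLongBuffer();
    }
    return views;
  }

  // Reads `length` longs starting at an arbitrary byte position.
  // bytePosition & 0x07 selects the view whose alignment matches the position;
  // bytePosition >>> 3 is the index of the long inside that view.
  static void readLongs(LongBuffer[] views, int bytePosition, long[] dst, int offset, int length) {
    LongBuffer view = views[bytePosition & 0x07];
    view.position(bytePosition >>> 3);
    view.get(dst, offset, length);
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(32);
    for (int i = 0; i < 32; i++) {
      buf.put(i, (byte) i);
    }
    LongBuffer[] views = buildViews(buf, ByteOrder.LITTLE_ENDIAN);
    long[] dst = new long[2];
    readLongs(views, 3, dst, 0, 2); // byte position 3 is not 8-byte aligned
    long direct = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN).getLong(3);
    System.out.println("bulk read matches getLong(3): " + (dst[0] == direct));
  }
}
```

The sketch only shows that the bulk `LongBuffer.get(long[], int, int)` call can start at any byte position once a view exists for each of the 8 alignments; it says nothing about whether this is faster than per-long unaligned reads, which is the question the comment raises.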