gsmiller commented on PR #12417: URL: https://github.com/apache/lucene/pull/12417#issuecomment-1634807010
OK, one last follow-up on my idea of using tighter lanes to get more concurrency for shift+mask while decoding: I educated myself a little more and explored using a scatter instruction to copy the output vector into a `byte[]` padded with 0s, so that each decoded value occupies 4 bytes instead of 1. My thought was that we could then throw a `ByteBuffer` on top of it and interpret the data as integers. From some micro-benchmarks I ran, it looks like the scatter is very costly and tanks performance, so I'm convinced at this point that the "tighter lane" idea isn't worth pursuing further. Here's essentially what I was doing, just to close the loop (I was experimenting with 64 values at a time instead of 128, and a bit width of 2, to keep things simple and only load data into a vector register once):

```java
// Maps each of the 16 byte lanes to the last byte of a 4-byte slot in the
// output, leaving the other 3 bytes of each slot as 0 padding.
private static final int[] scatterMap = new int[16];
static {
  int upto = 3;
  for (int i = 0; i < 16; i++) {
    scatterMap[i] = upto;
    upto += 4;
  }
}

public static void unpackSimd2(byte[] in, byte[] out) {
  ByteVector inVec = ByteVector.fromArray(BYTE_SPECIES_128, in, 0);
  ByteVector outVec;
  int upto = 0;
  for (int shift = 0; shift < 8; shift += 2) {
    // Shift + mask to extract the next 2-bit value from each byte lane, then
    // scatter the 16 decoded bytes into 4-byte-wide slots in `out`.
    outVec = inVec.lanewise(VectorOperators.LSHR, shift).and(BYTE_MASK);
    outVec.intoArray(out, upto, scatterMap, 0);
    upto += 64;
  }
}
```
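
To spell out the "throw a ByteBuffer on top of this" step, here's a rough sketch of how the padded output could be read back as ints (a hypothetical `reinterpretAsInts` helper for illustration, not part of the benchmark code above):

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// Sketch only: view the padded byte[] produced by unpackSimd2 as ints.
// The scatter map writes each decoded byte to offset 3 within its 4-byte slot,
// i.e. the least-significant position under ByteBuffer's default big-endian
// order, so each int comes out equal to the decoded 2-bit value.
static int[] reinterpretAsInts(byte[] out) {
  IntBuffer ints = ByteBuffer.wrap(out).asIntBuffer();
  int[] values = new int[ints.remaining()];
  ints.get(values);
  return values;
}
```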