uschindler commented on a change in pull request #18: URL: https://github.com/apache/lucene/pull/18#discussion_r594952973
##########
File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java
##########
@@ -17,16 +17,123 @@
 package org.apache.lucene.util;

+import java.lang.invoke.MethodHandle;
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.util.Base64;
+
 /** Utilities for computations with numeric arrays */
 public final class VectorUtil {

   private VectorUtil() {}

+  // org.apache.lucene.util.VectorUtilSIMD#dotProduct(float[], float[])
+  private static final String SIMD_BASE64 =
+      "yv66vgAAADwAbQoAAgADBwAEDAAFAAYBABBqYXZhL2xhbmcvT2JqZWN0AQAGPGluaXQ+AQADKClW\n"

Review comment:

> But yeah, it is true that maybe we can start working those other hotspots off as well. For example, IMO it is silly with mmap directory for us to be decoding byte[] slowly into a float[] (readLEFloats or whatever). Vector API can use byte[] or even ByteBuffer directly (I assume any conversions are vectorized too, have not experimented with that).

It gets even worse with MMapDirectory version 2 for Java 16. So IMHO, once we are really on JDK 17 minimum, we should change the method signatures of IndexInput and replace our `void readLEFloats(float[])` with `FloatVector readFloatVector()`. On MMapDirectory this can use a ByteBuffer or MemorySegment directly on the mmapped contents. That would spare millions of pointless native->heap arraycopy operations.
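To make the copy-versus-in-place point concrete, here is a minimal plain-Java sketch. It does not use the Vector API or any Lucene class: `FloatReadSketch`, `encodeLE`, `dotProductCopy` and `dotProductInPlace` are hypothetical names invented for illustration, and a direct `ByteBuffer` stands in for mmapped index contents. The first method decodes into a heap `float[]` before computing (roughly what a `readLEFloats(float[])` call forces on the caller), the second reads the little-endian floats through a `FloatBuffer` view without the intermediate array.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical sketch, not Lucene code: copy-then-compute vs. reading floats in place. */
final class FloatReadSketch {
  private FloatReadSketch() {}

  /** Encodes values as little-endian floats in a direct buffer (stand-in for mmapped data). */
  static ByteBuffer encodeLE(float... values) {
    ByteBuffer bytes =
        ByteBuffer.allocateDirect(values.length * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    // The view has its own position, so bytes.position() stays at 0.
    bytes.asFloatBuffer().put(values);
    return bytes;
  }

  /** Copying approach: bulk-decode into a heap float[] first, then compute. */
  static float dotProductCopy(ByteBuffer bytes, float[] query) {
    float[] scratch = new float[query.length];
    // duplicate() resets the byte order to BIG_ENDIAN, so it must be set again.
    bytes.duplicate().order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().get(scratch);
    float sum = 0f;
    for (int i = 0; i < query.length; i++) {
      sum += scratch[i] * query[i];
    }
    return sum;
  }

  /** In-place approach: read each float straight from the buffer, no intermediate array. */
  static float dotProductInPlace(ByteBuffer bytes, float[] query) {
    FloatBuffer view = bytes.duplicate().order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer();
    float sum = 0f;
    for (int i = 0; i < query.length; i++) {
      sum += view.get(i) * query[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    ByteBuffer stored = encodeLE(1f, 2f, 3f);
    float[] query = {4f, 5f, 6f};
    System.out.println(dotProductCopy(stored, query));
    System.out.println(dotProductInPlace(stored, query));
  }
}
```

Both methods return the same value; the difference is only the extra `float[]` fill in the first one. Whether a Vector API load from the buffer would behave the same way is not shown here and would need to be measured.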