jpountz commented on code in PR #14896: URL: https://github.com/apache/lucene/pull/14896#discussion_r2186132941
##########
lucene/core/src/java24/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##########
@@ -1001,4 +1007,26 @@ public float recalculateScalarQuantizationOffset(
     return correction;
   }
+
+  @Override
+  public int filterWithDouble(int[] docBuffer, double[] scoreBuffer, double threshold, int upTo) {
+    int newUpto = 0;
+    int i = 0;
+    for (int bound = upTo - DOUBLE_SPECIES.length() + 1; i < bound; i += DOUBLE_SPECIES.length()) {

Review Comment:
   Can you use VectorSpecies#loopBound to make this more idiomatic?

##########
lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorizationProvider.java:
##########
@@ -217,7 +217,8 @@ private static Optional<Module> lookupVectorModule() {
           "org.apache.lucene.util.VectorUtil",
           "org.apache.lucene.codecs.lucene103.Lucene103PostingsReader",
           "org.apache.lucene.codecs.lucene103.PostingIndexInput",
-          "org.apache.lucene.tests.util.TestSysoutsLimits");
+          "org.apache.lucene.tests.util.TestSysoutsLimits",
+          "org.apache.lucene.search.ScorerUtil");

Review Comment:
   This shouldn't be necessary since ScorerUtil calls the new method via VectorUtil which is already whitelisted?
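
   For reference on the `VectorSpecies#loopBound` suggestion above: the JDK documents `loopBound(length)` as returning the largest multiple of the species' lane count that is less than or equal to `length`. The sketch below is plain Java so that it runs without the incubating vector module. `LoopBoundSketch` and its helpers are hypothetical stand-ins written for illustration, not code from the PR or from Lucene; they only check that the hand-written bound in the hunk and a `loopBound`-style bound visit the same number of full vector chunks.

```java
// Hypothetical illustration only; none of these names exist in Lucene or the JDK.
class LoopBoundSketch {

  // Stand-in for VectorSpecies#loopBound(int): largest multiple of `lanes` that is <= length.
  // Assumes length >= 0 and lanes > 0.
  static int loopBound(int length, int lanes) {
    return length - (length % lanes);
  }

  // Counts full-vector iterations using the bound written by hand in the PR hunk.
  static int iterationsHandWritten(int upTo, int lanes) {
    int count = 0;
    for (int i = 0, bound = upTo - lanes + 1; i < bound; i += lanes) {
      count++;
    }
    return count;
  }

  // Counts full-vector iterations using a loopBound-style bound.
  static int iterationsWithLoopBound(int upTo, int lanes) {
    int count = 0;
    for (int i = 0, bound = loopBound(upTo, lanes); i < bound; i += lanes) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    int lanes = 4; // assumed lane count, e.g. a 256-bit double species
    for (int upTo = 0; upTo <= 10; upTo++) {
      System.out.println(
          "upTo=" + upTo
              + " handWritten=" + iterationsHandWritten(upTo, lanes)
              + " loopBound=" + iterationsWithLoopBound(upTo, lanes));
    }
  }
}
```

   With the real API the loop header would presumably read `for (int bound = DOUBLE_SPECIES.loopBound(upTo); i < bound; i += DOUBLE_SPECIES.length())`, with the remaining tail elements still handled by a scalar loop.
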

##########
lucene/core/src/java24/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##########
@@ -1001,4 +1007,26 @@ public float recalculateScalarQuantizationOffset(
     return correction;
   }
+
+  @Override
+  public int filterWithDouble(int[] docBuffer, double[] scoreBuffer, double threshold, int upTo) {
+    int newUpto = 0;
+    int i = 0;
+    for (int bound = upTo - DOUBLE_SPECIES.length() + 1; i < bound; i += DOUBLE_SPECIES.length()) {
+      DoubleVector scoreVector = DoubleVector.fromArray(DOUBLE_SPECIES, scoreBuffer, i);
+      IntVector docVector = IntVector.fromArray(INT_FOR_DOUBLE_SPECIES, docBuffer, i);
+      VectorMask<Double> mask = scoreVector.compare(VectorOperators.GE, threshold);
+      docVector.compress(mask.cast(INT_FOR_DOUBLE_SPECIES)).intoArray(docBuffer, newUpto);
+      scoreVector.compress(mask).intoArray(scoreBuffer, newUpto);

Review Comment:
   nit: it would be nicer to compress vectors in the same order as they were declared a few lines above

##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -376,4 +376,24 @@ public static float recalculateOffset(
     return IMPL.recalculateScalarQuantizationOffset(
         vector, oldAlpha, oldMinQuantile, scale, alpha, minQuantile, maxQuantile);
   }
+
+  /**
+   * filter both docBuffer and scoreBuffer with threshold, each docBuffer and scoreBuffer of the
+   * same index forms a pair, pairs with score less than threshold will be filtered out from the
+   * array.
+   *
+   * @param docBuffer doc buffer contains docs (or some other value forms a pair with scoreBuffer)
+   * @param scoreBuffer score buffer contains scores to be compared with threshold
+   * @param threshold minimal required double value to not be filtered out
+   * @param upTo where the filter should end
+   * @return how many pairs left after filter
+   */
+  public static int filterWithDouble(
+      int[] docBuffer, double[] scoreBuffer, double threshold, int upTo) {

Review Comment:
   nit: maybe rename threshold to `minScoreInclusive` to better convey expectations that this is a min score (as opposed to max) and that it is inclusive?

##########
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##########
@@ -376,4 +376,24 @@ public static float recalculateOffset(
     return IMPL.recalculateScalarQuantizationOffset(
         vector, oldAlpha, oldMinQuantile, scale, alpha, minQuantile, maxQuantile);
   }
+
+  /**
+   * filter both docBuffer and scoreBuffer with threshold, each docBuffer and scoreBuffer of the
+   * same index forms a pair, pairs with score less than threshold will be filtered out from the
+   * array.
+   *
+   * @param docBuffer doc buffer contains docs (or some other value forms a pair with scoreBuffer)
+   * @param scoreBuffer score buffer contains scores to be compared with threshold
+   * @param threshold minimal required double value to not be filtered out
+   * @param upTo where the filter should end
+   * @return how many pairs left after filter
+   */
+  public static int filterWithDouble(

Review Comment:
   I wonder if we can find a better name. `filterByScore` maybe?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@lucene.apache.org