zhaih commented on code in PR #12480:
URL: https://github.com/apache/lucene/pull/12480#discussion_r1302276710
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##########
@@ -401,8 +404,23 @@ private boolean isDiverse(byte[] candidate, NeighborArray neighbors, float score
* Find first non-diverse neighbour among the list of neighbors starting from the most distant
* neighbours
*/
- private int findWorstNonDiverse(NeighborArray neighbors) throws IOException {
- int[] uncheckedIndexes = neighbors.sort();
+ private int findWorstNonDiverse(NeighborArray neighbors, int nodeOrd) throws IOException {
+ int[] uncheckedIndexes = neighbors.sort(nbrOrd -> {
+ float[] vectorValue = null;
+ byte[] binaryValue = null;
+ switch (this.vectorEncoding) {
+ case FLOAT32 -> vectorValue = (float[]) vectors.vectorValue(nodeOrd);
Review Comment:
Let's move this part outside the lambda to reduce the number of times we call
`vectorValue`; each call involves seek and parse operations on off-heap
memory.
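A minimal sketch of the hoisting idea, using a hypothetical `CountingVectors` stand-in rather than the real Lucene classes (all names below are illustrative assumptions). The node's vector is read once before the lambda is created, so the lambda only reads each neighbor's vector. This assumes the array returned for the node is not overwritten by later reads; if the underlying reader reuses a buffer, a separate copy of the vector values would be needed.

```java
import java.util.function.IntToDoubleFunction;

public class HoistLookupSketch {
  /** Hypothetical stand-in for an off-heap vector source; counts how often it is read. */
  public static final class CountingVectors {
    private final float[][] data;
    public int reads = 0;

    public CountingVectors(float[][] data) {
      this.data = data;
    }

    public float[] vectorValue(int ord) {
      reads++; // in the real code each call may involve a seek and a parse
      return data[ord];
    }
  }

  /** Plain dot product, used here as a placeholder similarity function. */
  public static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Scores every neighbor against nodeOrd, reading the node's vector once up front. */
  public static float[] scoreNeighbors(CountingVectors vectors, int nodeOrd, int[] neighbors) {
    // Hoisted out of the lambda: one read for the node, not one per neighbor.
    float[] nodeVector = vectors.vectorValue(nodeOrd);
    IntToDoubleFunction scorer = nbrOrd -> dot(nodeVector, vectors.vectorValue(nbrOrd));

    float[] scores = new float[neighbors.length];
    for (int i = 0; i < neighbors.length; i++) {
      scores[i] = (float) scorer.applyAsDouble(neighbors[i]);
    }
    return scores;
  }
}
```

With N neighbors this performs N + 1 reads, where looking the node up inside the lambda would perform 2N.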
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -197,4 +206,8 @@ private int descSortFindRightMostInsertionPoint(float newScore, int bound) {
}
return start;
}
+
+ interface ScoringFunction {
Review Comment:
Could you add Javadoc for both the interface and its method?
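Something along these lines, as a rough sketch only. The hunk does not show the interface body, so the method name `computeScore`, its signature, and the wording of the comments are assumptions, and the wrapper class exists only to make the snippet compile on its own.

```java
import java.io.IOException;

public class ScoringFunctionSketch {
  /**
   * Computes the score between a fixed node and one of its neighbors, identified by ordinal.
   * (Hypothetical documentation; adjust to the actual contract.)
   */
  @FunctionalInterface
  public interface ScoringFunction {
    /**
     * Returns the score for the given neighbor.
     *
     * @param neighborOrd ordinal of the neighbor to score
     * @return the score of that neighbor relative to the fixed node
     * @throws IOException if reading the neighbor's vector fails
     */
    float computeScore(int neighborOrd) throws IOException;
  }

  /** Small helper for the example: applies the function and rethrows I/O failures unchecked. */
  public static float apply(ScoringFunction function, int neighborOrd) {
    try {
      return function.computeScore(neighborOrd);
    } catch (IOException e) {
      throw new java.io.UncheckedIOException(e);
    }
  }
}
```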
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##########
@@ -211,16 +214,16 @@ private void initializeFromGraph(
oldNeighbor != NO_MORE_DOCS;
oldNeighbor = initializerGraph.nextNeighbor()) {
int newNeighbor = oldToNewOrdinalMap.get(oldNeighbor);
- float score =
- switch (this.vectorEncoding) {
- case FLOAT32 -> this.similarityFunction.compare(
- vectorValue, (float[]) vectorsCopy.vectorValue(newNeighbor));
- case BYTE -> this.similarityFunction.compare(
- binaryValue, (byte[]) vectorsCopy.vectorValue(newNeighbor));
- };
+// float score =
Review Comment:
Let's remove this commented-out code.