benwtrent commented on PR #14097: URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573940308
If we really think `vint` is the cause, I wonder if we should switch the encoding to the `readGroupVInts` stuff? https://github.com/apache/lucene/issues/12871

My thinking on why the BP reordering helps was that the order in which the vectors are read is itself helping the vector indexing, because clusters of vectors are indexed together. Honestly, bootstrapping HNSW via clusters is a really nice idea (and could help solve the "merge cost problem" by reusing clusters during merge...).

The numbers here are really nice. I just want to understand why they are better, especially since recall changes, which seems to indicate that the graph building itself is being changed.
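For anyone following along, here is a rough sketch of the difference between the two encodings being discussed. It is written in Python for brevity and is only an illustration of the general techniques: the function names are made up for this example, and the group layout shown (one flag byte holding four 2-bit lengths, then little-endian payload bytes) is an assumption about the general scheme, not a claim about the exact bytes Lucene's `readGroupVInts` expects.

```python
def encode_vint(value: int) -> bytes:
    """Variable-length int: 7 payload bits per byte, high bit set if more bytes follow."""
    if not 0 <= value < 2**32:
        raise ValueError("expected an unsigned 32-bit value")
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # low 7 bits plus continuation bit
        value >>= 7
    out.append(value)  # final byte, continuation bit clear
    return bytes(out)


def decode_vint(buf: bytes, pos: int = 0):
    """Decode one vint starting at pos; the loop branches on every byte."""
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if b & 0x80 == 0:
            return result, pos
        shift += 7


def encode_group4(values) -> bytes:
    """Group-varint sketch: one flag byte, then 1 to 4 bytes per value."""
    if len(values) != 4:
        raise ValueError("this sketch encodes exactly four values per group")
    flag = 0
    body = bytearray()
    for v in values:
        if not 0 <= v < 2**32:
            raise ValueError("expected an unsigned 32-bit value")
        n = max(1, (v.bit_length() + 7) // 8)  # bytes needed, between 1 and 4
        flag = (flag << 2) | (n - 1)           # store length - 1 in 2 bits
        body += v.to_bytes(n, "little")
    return bytes([flag]) + bytes(body)


def decode_group4(buf: bytes, pos: int = 0):
    """Decode four values; all lengths are known after reading the flag byte."""
    flag = buf[pos]
    pos += 1
    values = []
    for shift in (6, 4, 2, 0):
        n = ((flag >> shift) & 0x3) + 1
        values.append(int.from_bytes(buf[pos:pos + n], "little"))
        pos += n
    return values, pos
```

The point of the grouped form is that the decoder learns all four lengths from a single byte, so it avoids the per-byte continuation check that `decode_vint` has to make. Whether that is what matters for the numbers in this PR is exactly the open question.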