msokolov opened a new pull request, #13910:
URL: https://github.com/apache/lucene/pull/13910

   While exploring some recall-related failures in another PR, I went looking 
for a unit test that checks HNSW/KNN recall and couldn't find one. I think we 
used to have one, but we may have removed it because it was flaky. We really do 
need such a test, since it is possible to make changes that preserve all the 
formal properties of the codecs and queries yet destroy recall. I thought a 
test built on known data and vectors would be more predictable than one using 
random data, so I wrote one, and it uncovered a couple of bugs: 
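
For readers unfamiliar with how recall is measured, here is a minimal, library-free sketch. It is not the test added in this PR; the class and method names (`RecallSketch`, `exactTopK`, `recall`) are hypothetical, and squared Euclidean distance is assumed purely for illustration. The idea is to compute the exact top-k by brute force over a fixed set of vectors and count how many of those ids the approximate search also returned.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;

public class RecallSketch {
    // Squared Euclidean distance; an assumed metric for this illustration.
    static double squareDistance(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Exact nearest neighbors by brute force: sort all ids by distance to the query.
    static int[] exactTopK(float[][] vectors, float[] query, int k) {
        Integer[] ids = new Integer[vectors.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        Arrays.sort(ids, Comparator.comparingDouble(
            (Integer i) -> squareDistance(vectors[i], query)));
        int[] top = new int[Math.min(k, ids.length)];
        for (int i = 0; i < top.length; i++) {
            top[i] = ids[i];
        }
        return top;
    }

    // Fraction of the exact results that the approximate search also found.
    static double recall(int[] expected, int[] actual) {
        if (expected.length == 0) {
            return 1.0;
        }
        Set<Integer> truth = new HashSet<>();
        for (int id : expected) {
            truth.add(id);
        }
        int hits = 0;
        for (int id : actual) {
            if (truth.remove(id)) { // remove so a duplicate hit is not counted twice
                hits++;
            }
        }
        return (double) hits / expected.length;
    }
}
```

With a fixed data set the expected ids never change between runs, so a test can assert a recall floor without the flakiness that random vectors introduce.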
   
   In Lucene90HnswVectorsReader we removed the ord-to-doc mappings, so search 
results contained vector ords instead of docids. I guess this would have 
completely broken back-compat for Lucene90 indexes. Probably there are none in 
the wild, which may be why this was never noticed.
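
To illustrate why a missing ord-to-doc mapping matters, here is a small sketch, not the Lucene90 reader code. It assumes vector ordinals are assigned densely in increasing docid order and that some documents have no vector; `OrdToDocSketch` and its methods are hypothetical names. As soon as one document lacks a vector, ordinals and docids diverge, so returning ords as if they were docids points at the wrong documents.

```java
public class OrdToDocSketch {
    // Build ordToDoc[ord] = docid of the ord-th vector, assuming ordinals
    // are assigned in increasing docid order (an assumption of this sketch).
    static int[] buildOrdToDoc(boolean[] docHasVector) {
        int count = 0;
        for (boolean has : docHasVector) {
            if (has) {
                count++;
            }
        }
        int[] ordToDoc = new int[count];
        int ord = 0;
        for (int doc = 0; doc < docHasVector.length; doc++) {
            if (docHasVector[doc]) {
                ordToDoc[ord++] = doc;
            }
        }
        return ordToDoc;
    }

    // Translate graph-search results (ordinals) into docids.
    static int[] toDocIds(int[] ords, int[] ordToDoc) {
        int[] docs = new int[ords.length];
        for (int i = 0; i < ords.length; i++) {
            docs[i] = ordToDoc[ords[i]];
        }
        return docs;
    }
}
```

For example, if doc 1 has no vector, ord 1 refers to doc 2; skipping the translation would report doc 1 as a hit even though it was never indexed with a vector.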
   
   In Lucene91RWFormat (used only for back-compat testing) we broke the 
diversity check, so we were producing bad graphs.
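
For context, here is a rough sketch of the kind of diversity heuristic HNSW graph builders commonly use. It is not the Lucene91 implementation; `DiversitySketch` and `selectDiverse` are hypothetical names, and squared Euclidean distance is an assumed metric. A candidate is linked only if it is closer to the node than to every neighbor already selected, which keeps the links spread out rather than clustered.

```java
import java.util.ArrayList;
import java.util.List;

public class DiversitySketch {
    // Squared Euclidean distance; an assumed metric for this illustration.
    static double squareDistance(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // candidatesNearestFirst must already be sorted by distance to the node.
    static List<Integer> selectDiverse(
            float[][] vectors, int node, int[] candidatesNearestFirst, int maxConn) {
        List<Integer> selected = new ArrayList<>();
        for (int candidate : candidatesNearestFirst) {
            if (selected.size() >= maxConn) {
                break;
            }
            double toNode = squareDistance(vectors[candidate], vectors[node]);
            boolean diverse = true;
            for (int neighbor : selected) {
                // Reject a candidate that is at least as close to an existing
                // neighbor as it is to the node itself.
                if (squareDistance(vectors[candidate], vectors[neighbor]) <= toNode) {
                    diverse = false;
                    break;
                }
            }
            if (diverse) {
                selected.add(candidate);
            }
        }
        return selected;
    }
}
```

If a check like this is inverted or skipped, the graph still satisfies its structural invariants, but its links cluster and recall suffers, which is the kind of regression only a recall test catches.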
   
   This PR fixes both bugs and adds the new test.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

