benwtrent commented on issue #13611:
URL: https://github.com/apache/lucene/issues/13611#issuecomment-3347751096

   @david-sitsky 
   
   Missing one document for a query that retrieves top-k=100 is 99% recall for 
that query, which is really, really good.
   
   If you are missing 5 or more of the true top k for many queries (recall 
<95%), that might be concerning.
   
   Have you sampled a couple hundred queries and calculated the overall 
average recall? That is the best way to benchmark.
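   
   A minimal sketch of that benchmark, assuming you already have, for each 
sampled query, the exact (brute-force) top-k ids and the ids returned by the 
approximate search. The function names `recall_at_k` and `average_recall` are 
hypothetical helpers for illustration, not part of Lucene.

```python
from statistics import mean


def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the true top-k ids that the approximate search returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 1.0  # nothing to find, so nothing was missed
    found = set(approx_ids[:k])
    return len(truth & found) / len(truth)


def average_recall(exact_results, approx_results, k):
    """Mean recall@k over sampled queries (two parallel lists of id lists)."""
    return mean(
        recall_at_k(exact, approx, k)
        for exact, approx in zip(exact_results, approx_results)
    )


# One query that misses a single document out of 100.
exact = list(range(100))
approx = list(range(99)) + [1000]
print(recall_at_k(exact, approx, 100))  # 0.99
```

   Averaging over a few hundred queries smooths out the occasional query 
that misses one or two documents.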
   
   If you always want the best vector, no matter what, that means 100% 
recall, which gets very expensive, and approximate kNN may not be the best 
fit for you.
   
   > . Also, the parent doc in this instance only has a single nested...
   
   I wouldn't expect that to be a significant factor. 
   
   You could try removing the parent/join structure, but I suspect you will 
run into other problems: you would have to oversample a bunch of child docs, 
as I expect multiple child documents to be similarly near the same query vector.
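
   A rough illustration of that oversampling problem, assuming child hits 
arrive as `(child_id, score)` pairs and a hypothetical `parent_of` lookup 
maps each child to its parent. This is not how Lucene's join queries work 
internally; it only shows why a plain top-k over children can yield fewer 
than k distinct parents.

```python
def collapse_to_parents(child_hits, parent_of, k):
    """Keep the best-scoring child per parent, then return the top k parents.

    child_hits: list of (child_id, score); a higher score means nearer.
    parent_of: dict mapping child_id -> parent_id (hypothetical lookup).
    """
    best = {}
    for child_id, score in child_hits:
        parent = parent_of[child_id]
        if parent not in best or score > best[parent]:
            best[parent] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


parent_of = {"c1": "A", "c2": "A", "c3": "A", "c4": "B", "c5": "C"}
hits = [("c1", 0.95), ("c2", 0.94), ("c3", 0.93), ("c4", 0.90), ("c5", 0.85)]

# Asking for only 3 children returns a single parent, because all 3 belong to A.
print(collapse_to_parents(hits[:3], parent_of, 3))
# Oversampling to 5 children is enough here to recover 3 distinct parents.
print(collapse_to_parents(hits, parent_of, 3))
```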


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

