benwtrent commented on PR #14226: URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657612038
I am not 100% sure what's up with the behavior. However, I switched to `16` (which also happens to be the graph conn) instead of `3`. It's interesting that visited is lower while recall is on par with the current non-consistent baseline. I don't like the magic "16" here without knowing exactly why it's working.

I also fixed visited in Lucene util (accounting for the multiple topdoc merges).

Here is NOT reusing the entry point scores:

```
recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
 0.942       10.400  8000000   100     100    68605           128        1.000
```

Here IS reusing the entry point scores (I have a local refactor). So, visited drops just a little:

```
recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
 0.942        7.400  8000000   100     100    68477           128        1.000
```

Here is reusing the entry point scores, single threaded:

```
recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
 0.942       49.300  8000000   100     100    68477           128        1.000
```

Note, the visited for a SINGLE graph is around 3000.

Also, as a further baseline, here is no information sharing (i.e., what it was before our inconsistent multi-threaded results), just fully exploring every graph:

```
recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
 1.000      213.400  8000000   100     100   412280           128        1.000
```
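As a rough sanity check on the figures reported above, here is a minimal Python sketch that derives per-segment visited counts and the visited ratio relative to the no-sharing baseline. It is not part of the PR or of Lucene util; the dictionary keys and helper names (`visited_per_segment`, `visited_ratio`) are illustrative assumptions, and only the numbers are taken from the tables.

```python
# Numbers copied from the benchmark tables in this comment.
# The run names and helper functions are hypothetical, for illustration only.
NUM_SEGMENTS = 128

runs = {
    "no_sharing": {"recall": 1.000, "latency_ms": 213.4, "visited": 412280},
    "sharing_no_reuse": {"recall": 0.942, "latency_ms": 10.4, "visited": 68605},
    "sharing_reuse": {"recall": 0.942, "latency_ms": 7.4, "visited": 68477},
}


def visited_per_segment(visited, num_segments=NUM_SEGMENTS):
    """Average number of visited nodes per segment graph."""
    return visited / num_segments


def visited_ratio(run, baseline):
    """Fraction of the baseline's visited count that this run needed."""
    return run["visited"] / baseline["visited"]


baseline = runs["no_sharing"]
for name, run in runs.items():
    per_segment = visited_per_segment(run["visited"])
    ratio = visited_ratio(run, baseline)
    print(f"{name}: {per_segment:.0f} visited/segment, {ratio:.3f} of baseline")

# Difference in visited between not reusing and reusing entry point scores.
# It equals the segment count, which would be consistent with one saved
# score per segment; the comment itself does not state that explanation.
saved = runs["sharing_no_reuse"]["visited"] - runs["sharing_reuse"]["visited"]
print(f"visited saved by reusing entry point scores: {saved}")
```

The no-sharing run averages roughly 3,200 visited nodes per segment, in line with the "around 3000" figure for a single graph, while both information-sharing runs visit about one sixth as many nodes in total.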