msokolov commented on PR #14226: URL: https://github.com/apache/lucene/pull/14226#issuecomment-2692803205
Sorry, it took me a while to get back to this. My local setup got messed up and somehow I was exhaustively searching entire segments! Anyway, I finally got this working again, including tracking reentries so we can see better what's going on. I also have a modified version of luceneutil that reports on this, which I will post over there, and I named the free parameter `lambda` so we can call it something. Detailed results on the Cohere 768 data are below.

My summary: when we have generous parameter settings (high `fanout`, high `lambda`, or both), the reentries don't add anything; the result queues are already large enough to capture the global top K. But at lower levels of these parameters, reentry allows some recovery of results that would otherwise have been overlooked. The way I'm thinking about this is that it is essentially equivalent to fanout, although a bit better because it scales with partition size, and it can additionally serve as a safety measure in case of highly skewed data. We wouldn't want to adopt this (more efficient) pro-rated strategy without reentry, because on its own the pro-rated strategy is vulnerable to that adversarial case.

Maybe we should have a larger-scale dataset that demonstrates this, e.g. one where a timestamp is a key part of the vector data and the documents are indexed over time.
# comparing reentry with no reentry

## LAMBDA=3, no re-entry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.732         5.433  500000    50       0       64        250         no     6218          0     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.770         7.389  500000   100       0       64        250         no     8849          0     0.00      Infinity             0             0.00          0.000         0.000
 0.837         7.338  500000    50      50       64        250         no     8849          0     0.00      Infinity             0             0.00          0.000         0.000
 0.832         9.366  500000   100      50       64        250         no    11174          0     0.00      Infinity             0             0.00          0.000         0.000
 0.876         9.211  500000    50     100       64        250         no    11174          0     0.00      Infinity             0             0.00          0.000         0.000
 0.867        11.018  500000   100     100       64        250         no    13375          0     0.00      Infinity             0             0.00          0.000         0.000
```

## LAMBDA=3, with reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.806         6.131  500000    50       0       64        250         no     6560       1128     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.838         8.953  500000   100       0       64        250         no     9460       1197     0.00      Infinity             0             0.00          0.000         0.000
 0.855         7.813  500000    50      50       64        250         no     8923        179     0.00      Infinity             0             0.00          0.000         0.000
 0.862        10.191  500000   100      50       64        250         no    11405        393     0.00      Infinity             0             0.00          0.000         0.000
 0.880         9.364  500000    50     100       64        250         no    11200         52     0.00      Infinity             0             0.00          0.000         0.000
 0.880        11.462  500000   100     100       64        250         no    13494        172     0.00      Infinity             0             0.00          0.000         0.000
```

## LAMBDA=5, with reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.837         6.932  500000    50       0       64        250         no     7814        415     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.858         9.765  500000   100       0       64        250         no    10959        497     0.00      Infinity             0             0.00          0.000         0.000
 0.876         9.107  500000    50      50       64        250         no    10739         76     0.00      Infinity             0             0.00          0.000         0.000
 0.879        11.781  500000   100      50       64        250         no    13420        175     0.00      Infinity             0             0.00          0.000         0.000
 0.893        11.336  500000    50     100       64        250         no    13316          7     0.00      Infinity             0             0.00          0.000         0.000
 0.892        13.209  500000   100     100       64        250         no    15752         69     0.00      Infinity             0             0.00          0.000         0.000
```

## LAMBDA=5, with no reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.803         6.554  500000    50       0       64        250         no     7682          0     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.823         9.062  500000   100       0       64        250         no    10707          0     0.00      Infinity             0             0.00          0.000         0.000
 0.870         9.137  500000    50      50       64        250         no    10707          0     0.00      Infinity             0             0.00          0.000         0.000
 0.866        11.392  500000   100      50       64        250         no    13313          0     0.00      Infinity             0             0.00          0.000         0.000
 0.893        11.330  500000    50     100       64        250         no    13313          0     0.00      Infinity             0             0.00          0.000         0.000
 0.888        13.281  500000   100     100       64        250         no    15705          0     0.00      Infinity             0             0.00          0.000         0.000
```

## LAMBDA=12, no re-entry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.886         9.955  500000    50       0       64        250         no    12100          0     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.893        13.789  500000   100       0       64        250         no    16645          0     0.00      Infinity             0             0.00          0.000         0.000
 0.905        13.922  500000    50      50       64        250         no    16645          0     0.00      Infinity             0             0.00          0.000         0.000
 0.904        16.443  500000   100      50       64        250         no    20215          0     0.00      Infinity             0             0.00          0.000         0.000
 0.911        16.797  500000    50     100       64        250         no    20215          0     0.00      Infinity             0             0.00          0.000         0.000
 0.910        19.732  500000   100     100       64        250         no    23274          0     0.00      Infinity             0             0.00          0.000         0.000
```

## LAMBDA=12, with re-entry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.887         9.826  500000    50       0       64        250         no    12108         20     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.895        13.490  500000   100       0       64        250         no    16666         39     0.00      Infinity             0             0.00          0.000         0.000
 0.905        13.616  500000    50      50       64        250         no    16645          0     0.00      Infinity             0             0.00          0.000         0.000
 0.904        16.507  500000   100      50       64        250         no    20217          4     0.00      Infinity             0             0.00          0.000         0.000
 0.911        16.565  500000    50     100       64        250         no    20215          0     0.00      Infinity             0             0.00          0.000         0.000
 0.910        19.024  500000   100     100       64        250         no    23274          1     0.00      Infinity             0             0.00          0.000         0.000
```
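
To make the summary concrete, here is a small sketch that computes the recall and latency differences between the re-entry and no-re-entry runs at each `lambda`. The recall and latency figures are copied from the tables in this comment; the names `RESULTS` and `recall_gain` are made up for illustration and are not part of luceneutil or the PR.

```python
# (topK, fanout) -> (recall, latency in ms), copied from the tables in this comment.
# RESULTS and recall_gain are illustrative names, not luceneutil APIs.
KEYS = [(50, 0), (100, 0), (50, 50), (100, 50), (50, 100), (100, 100)]


def _rows(recalls, latencies):
    """Pair each (topK, fanout) setting with its (recall, latency) measurement."""
    return dict(zip(KEYS, zip(recalls, latencies)))


RESULTS = {
    3: {
        "no_reentry": _rows([0.732, 0.770, 0.837, 0.832, 0.876, 0.867],
                            [5.433, 7.389, 7.338, 9.366, 9.211, 11.018]),
        "reentry": _rows([0.806, 0.838, 0.855, 0.862, 0.880, 0.880],
                         [6.131, 8.953, 7.813, 10.191, 9.364, 11.462]),
    },
    5: {
        "no_reentry": _rows([0.803, 0.823, 0.870, 0.866, 0.893, 0.888],
                            [6.554, 9.062, 9.137, 11.392, 11.330, 13.281]),
        "reentry": _rows([0.837, 0.858, 0.876, 0.879, 0.893, 0.892],
                         [6.932, 9.765, 9.107, 11.781, 11.336, 13.209]),
    },
    12: {
        "no_reentry": _rows([0.886, 0.893, 0.905, 0.904, 0.911, 0.910],
                            [9.955, 13.789, 13.922, 16.443, 16.797, 19.732]),
        "reentry": _rows([0.887, 0.895, 0.905, 0.904, 0.911, 0.910],
                         [9.826, 13.490, 13.616, 16.507, 16.565, 19.024]),
    },
}


def recall_gain(lam):
    """Return {(topK, fanout): (recall delta, latency delta in ms)} for one lambda.

    Deltas are (with re-entry) minus (without re-entry), rounded to 3 decimals.
    """
    base = RESULTS[lam]["no_reentry"]
    with_reentry = RESULTS[lam]["reentry"]
    return {
        key: (
            round(with_reentry[key][0] - base[key][0], 3),
            round(with_reentry[key][1] - base[key][1], 3),
        )
        for key in KEYS
    }


for lam in sorted(RESULTS):
    print(f"LAMBDA={lam}")
    for (top_k, fanout), (d_recall, d_latency) in recall_gain(lam).items():
        print(f"  topK={top_k:<3} fanout={fanout:<3} "
              f"recall {d_recall:+.3f}  latency {d_latency:+.3f} ms")
```

The largest recall differences show up at `LAMBDA=3` with `fanout=0`, and they shrink to roughly nothing at `LAMBDA=12`, which is the pattern described in the summary. The latency deltas are single-run numbers, so small differences in either direction are probably noise.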