benwtrent commented on PR #14160: URL: https://github.com/apache/lucene/pull/14160#issuecomment-2627228519
I did some more testing, this time on a single segment of our nightly runs. The recall & latency pattern is much healthier with this change, though the recall is lower. The only reason the baseline's recall is so high for the restrictive filters is that it over-eagerly drops to brute force because it spends way too much time doing vector comparisons.

<img width="594" alt="image" src="https://github.com/user-attachments/assets/5f6d5f4f-d053-4e50-9858-fb1e4d8f2023" />

BASELINE
```
recall  latency (ms)     nDoc  topK  fanout  visited  selectivity
 1.000       131.763  8000000   100      50    79814        0.010
 0.924        50.518  8000000   100      50    53003        0.050
 0.912        18.970  8000000   100      50    30095        0.100
 0.896        10.697  8000000   100      50    16942        0.200
 0.884         7.509  8000000   100      50    12057        0.300
 0.876         5.763  8000000   100      50     9476        0.400
 0.869         4.792  8000000   100      50     7905        0.500
 0.863         4.184  8000000   100      50     6777        0.600
 0.858         3.781  8000000   100      50     5966        0.700
 0.853         3.403  8000000   100      50     5351        0.800
 0.850         3.084  8000000   100      50     4855        0.900
 0.849         3.044  8000000   100      50     4645        0.950
 0.848         2.927  8000000   100      50     4492        0.990
```

Candidate:
```
recall  latency (ms)     nDoc  topK  fanout  visited  selectivity
 0.481         4.976  8000000   100      50     2162        0.010
 0.714         7.366  8000000   100      50     4141        0.050
 0.789         8.558  8000000   100      50     7222        0.100
 0.816         9.448  8000000   100      50    10318        0.200
 0.803         7.908  8000000   100      50    10281        0.300
 0.796         7.088  8000000   100      50     9406        0.400
 0.767         4.415  8000000   100      50     5909        0.500
 0.791         4.280  8000000   100      50     5838        0.600
 0.807         3.892  8000000   100      50     5677        0.700
 0.820         3.708  8000000   100      50     5291        0.800
 0.833         3.088  8000000   100      50     4481        0.900
 0.840         2.902  8000000   100      50     4308        0.950
 0.846         2.959  8000000   100      50     4418        0.990
```

```
recall  latency (ms)     nDoc  topK  fanout  visited  selectivity
 0.714         7.722  8000000   100      50     4141        0.050
 0.721         7.813  8000000   100      60     4329        0.050
 0.728         8.158  8000000   100      70     4515        0.050
 0.734         8.656  8000000   100      80     4701        0.050
 0.741         8.719  8000000   100      90     4885        0.050
 0.746         6.566  8000000   100     100     5063        0.050
 0.751         6.493  8000000   100     110     5239        0.050
 0.756         6.913  8000000   100     120     5416        0.050
 0.761         7.186  8000000   100     130     5585        0.050
 0.765         7.595  8000000   100     140     5756        0.050
 0.769         7.150  8000000   100     150     5923        0.050
 0.773         8.241  8000000   100     160     6093        0.050
 0.777         8.056  8000000   100     170     6255        0.050
 0.780         8.386  8000000   100     180     6424        0.050
 0.784         8.963  8000000   100     190     6584        0.050
 0.787         8.786  8000000   100     200     6743        0.050
 0.790         9.459  8000000   100     210     6902        0.050
 0.793         8.709  8000000   100     220     7058        0.050
 0.796         8.881  8000000   100     230     7213        0.050
 0.798         9.612  8000000   100     240     7367        0.050
 0.801         9.527  8000000   100     250     7519        0.050
```
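To make the trade-off easier to read, here is a rough sketch that puts a few rows of the BASELINE and Candidate tables above side by side. The numbers are copied from those tables; the helper name `compare` and the "visited fraction" metric (visited divided by `nDoc * selectivity`) are my own illustrative choices, not something the benchmark tool reports, and it assumes `selectivity` is the fraction of documents passing the filter.

```python
# Rows copied from the BASELINE and Candidate tables above (fanout=50, topK=100).
# Mapping: selectivity -> (recall, latency_ms, visited)
N_DOC = 8_000_000

baseline = {
    0.01: (1.000, 131.763, 79814),
    0.05: (0.924, 50.518, 53003),
    0.10: (0.912, 18.970, 30095),
    0.50: (0.869, 4.792, 7905),
    0.99: (0.848, 2.927, 4492),
}
candidate = {
    0.01: (0.481, 4.976, 2162),
    0.05: (0.714, 7.366, 4141),
    0.10: (0.789, 8.558, 7222),
    0.50: (0.767, 4.415, 5909),
    0.99: (0.846, 2.959, 4418),
}


def compare(selectivity):
    """Hypothetical helper: summarize candidate vs. baseline at one selectivity."""
    b_recall, b_latency, b_visited = baseline[selectivity]
    c_recall, c_latency, c_visited = candidate[selectivity]
    # Assumption: selectivity is the fraction of docs that pass the filter.
    matching_docs = N_DOC * selectivity
    return {
        "recall_delta": c_recall - b_recall,          # negative = recall lost
        "latency_ratio": b_latency / c_latency,       # >1 = candidate is faster
        "baseline_visited_frac": b_visited / matching_docs,
        "candidate_visited_frac": c_visited / matching_docs,
    }


if __name__ == "__main__":
    for s in sorted(baseline):
        r = compare(s)
        print(
            f"selectivity={s:.2f}  "
            f"recall_delta={r['recall_delta']:+.3f}  "
            f"latency_ratio={r['latency_ratio']:.2f}x  "
            f"visited_frac baseline={r['baseline_visited_frac']:.3f} "
            f"candidate={r['candidate_visited_frac']:.3f}"
        )
```

At selectivity 0.01 the baseline visits 79814 vectors against roughly 80000 matching docs, which would be consistent with a near-exhaustive scan of the filtered set and its 1.000 recall. That reading is my inference from the numbers, not something the tables state directly.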