msokolov commented on PR #14167: URL: https://github.com/apache/lucene/pull/14167#issuecomment-2619481888
I was thinking of another approach based on pro-rating. On its own this is deterministic and close to optimally efficient, but risks missing the best results when the index is skewed. It seems to me that if the HNSW search could be made re-entrant, by preserving the state in the HnswSearcher (visited list, priority queues), then we could examine all the per-segment results after completing a pass through the graphs, and then revisit some segments more deeply if the results appear skewed. Basically, the information-sharing would be done in a sequential, periodic fashion.

On Monday, January 27th, 2025 at 2:55 PM, Benjamin Trent ***@***.***> wrote:

>> This made me wonder if it would be a better trade-off to let just one slice run on its own first, and then let all other N-1 slices run in parallel with one another,
>
> I really like this idea. For kNN search, it seems best to take the largest tiers, gather information from them, and then run the smaller tiers in parallel.
>
> The major downside of kNN is that there is no slicing at all. Every segment is just its own worker, which is sort of crazy. We should at a minimum combine all the tiny segments together into a single thread.
>
> What do you think ***@***.***(https://github.com/mayya-sharipova) ? Slicing the segments and then picking the "largest" slice and search that in current thread. Then using that information to help the future parallel threads?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
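To make the pro-rate-then-revisit idea concrete, here is a minimal toy sketch. It is not Lucene code: `ProRatedSearchSketch`, `ResumableSegmentSearch`, `quota` and `search` are hypothetical names, and a cursor over a pre-sorted score array stands in for the preserved HNSW state (visited list, priority queues). The skew test is also an assumption: a segment is revisited when even its worst returned hit still beats the current global k-th score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: none of these names are Lucene APIs. */
class ProRatedSearchSketch {

  /** A scored hit tagged with the segment it came from. */
  record Hit(int segment, float score) {}

  /** Stand-in for a re-entrant per-segment search: a cursor over scores, best first. */
  static final class ResumableSegmentSearch {
    private final int segment;
    private final float[] scores; // ascending after sort; read from the end
    private int returned = 0;     // the preserved "search state"

    ResumableSegmentSearch(int segment, float[] scores) {
      this.segment = segment;
      this.scores = scores.clone();
      Arrays.sort(this.scores);
    }

    int size() { return scores.length; }

    boolean exhausted() { return returned >= scores.length; }

    /** Continue where the previous call stopped and emit up to n more hits. */
    List<Hit> resume(int n) {
      List<Hit> out = new ArrayList<>();
      while (n-- > 0 && !exhausted()) {
        out.add(new Hit(segment, scores[scores.length - 1 - returned++]));
      }
      return out;
    }
  }

  /** Pro-rated quota: a share of k proportional to segment size, at least 1. */
  static int quota(int k, int segmentSize, int totalSize) {
    return Math.max(1, (int) Math.ceil((double) k * segmentSize / totalSize));
  }

  static List<Hit> search(List<ResumableSegmentSearch> segments, int k) {
    if (k <= 0) return List.of();
    int total = segments.stream().mapToInt(ResumableSegmentSearch::size).sum();
    List<Hit> collected = new ArrayList<>();
    // Worst score each segment has returned so far; +inf forces a first pass.
    float[] worstSeen = new float[segments.size()];
    Arrays.fill(worstSeen, Float.POSITIVE_INFINITY);

    boolean revisited = true;
    while (revisited) {
      revisited = false;
      collected.sort((a, b) -> Float.compare(b.score(), a.score()));
      float kthScore = collected.size() >= k
          ? collected.get(k - 1).score() : Float.NEGATIVE_INFINITY;
      for (int i = 0; i < segments.size(); i++) {
        ResumableSegmentSearch seg = segments.get(i);
        // Skew check: everything this segment returned is still competitive,
        // so it may be hiding more good hits. Go deeper in that segment only.
        if (!seg.exhausted() && worstSeen[i] > kthScore) {
          List<Hit> more = seg.resume(quota(k, seg.size(), total));
          collected.addAll(more);
          worstSeen[i] = more.get(more.size() - 1).score();
          revisited = true;
        }
      }
    }
    return new ArrayList<>(collected.subList(0, Math.min(k, collected.size())));
  }
}
```

In this toy model the loop ends with the exact top k, because a segment whose worst returned hit no longer beats the k-th score cannot contribute anything better. A real HNSW traversal is approximate, so the same check would only be a heuristic for deciding which segments to resume.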
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org