jpountz commented on PR #14525: URL: https://github.com/apache/lucene/pull/14525#issuecomment-2828722257
> the implementation is more ambitious

I like ambition, but it also makes this change harder to review/integrate, especially with the high LOC count. I would suggest splitting this PR into multiple PRs, for instance:
- First PR just works with indexes created with existing recursive graph bisection and uses basic heuristics to determine which ranges of doc IDs to score first (e.g. using impacts) to hopefully increase the top-k score quickly. No extra data stored in the index. All code under lucene/misc rather than core.
- Another PR can introduce the SLA-based termination logic.
- Another PR can introduce topical clustering mechanism of Kulkarni and Callan, that the paper suggests combining with recursive graph bisection.
- Another PR can discuss augmenting index formats to enhance the range selection logic.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org