atris commented on PR #14525: URL: https://github.com/apache/lucene/pull/14525#issuecomment-2824857619
@jpountz Thanks! Here is the paper: https://arxiv.org/abs/2104.08976

Note that the core inspiration for this PR's approach comes from the paper, but the implementation diverges in certain ways:

The paper talks about using bins mainly for hard cutoffs and filtering. The PR, instead, uses bins to compute adaptive score boosts and wires them in directly through the new Collector.

The PR also adds:
• index-time graph-based binning (exact + approximate). This adds minimal indexing latency but gives significant improvements in search time.
• bin-level boosting at the segment level

So while the high-level idea overlaps, the implementation is more ambitious. It also opens the door to extensions such as bin skipping and multi-field support, which could then use graph intersection or fusion to identify documents that are strongly connected across multiple semantic dimensions.
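
To make the contrast with hard cutoffs concrete, here is a minimal sketch of what a bin-level score boost could look like. It is not the PR's code and uses no Lucene API: the class name `BinBoostSketch`, the methods `binMeans` and `boost`, the `alpha` parameter, and the boost formula itself are all assumptions chosen for illustration. The point is only that every document keeps a score, scaled by how strong its bin is, rather than being filtered out.

```java
import java.util.Arrays;

// Hypothetical sketch: none of these names or formulas come from the PR or from Lucene.
final class BinBoostSketch {

    // Mean base score of the documents assigned to each bin; empty bins get 0.
    static double[] binMeans(double[] scores, int[] binOf, int numBins) {
        double[] sum = new double[numBins];
        int[] count = new int[numBins];
        for (int doc = 0; doc < scores.length; doc++) {
            sum[binOf[doc]] += scores[doc];
            count[binOf[doc]]++;
        }
        double[] mean = new double[numBins];
        for (int b = 0; b < numBins; b++) {
            mean[b] = count[b] == 0 ? 0.0 : sum[b] / count[b];
        }
        return mean;
    }

    // Assumed boost rule: boosted = base * (1 + alpha * binMean / maxBinMean).
    // No document is dropped; documents in weaker bins just receive a smaller boost.
    static double[] boost(double[] scores, int[] binOf, int numBins, double alpha) {
        double[] mean = binMeans(scores, binOf, numBins);
        double max = Arrays.stream(mean).max().orElse(0.0);
        double[] out = new double[scores.length];
        for (int doc = 0; doc < scores.length; doc++) {
            double relative = max > 0.0 ? mean[binOf[doc]] / max : 0.0;
            out[doc] = scores[doc] * (1.0 + alpha * relative);
        }
        return out;
    }
}
```

With `alpha = 0` the sketch leaves the base scores unchanged, so the boost strength can be tuned without changing which documents are returned.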