atris commented on PR #14525:
URL: https://github.com/apache/lucene/pull/14525#issuecomment-2824857619

   @jpountz Thanks! Here is the paper: https://arxiv.org/abs/2104.08976
   
   Note that the core inspiration of this PR's approach comes from the paper, 
but the implementation diverges in certain ways:
   
   The paper uses bins mainly for hard cutoffs and filtering. The PR instead 
uses bins to compute adaptive score boosts, and wires them in directly using 
the new Collector.
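   
   To make the cutoff-versus-boost distinction concrete, here is a minimal 
sketch. It is not the PR's code and uses no Lucene classes. `BinBoostSketch`, 
`binBoosts`, `boostedScores`, and the `alpha` parameter are hypothetical names, 
and the boost formula (1 + alpha * binMean / maxBinMean) is only an assumed 
example of an "adaptive" boost. Scores are assumed to be non-negative.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the PR's implementation. */
public final class BinBoostSketch {

    /**
     * Returns one boost per bin in the range [1, 1 + alpha].
     * A bin whose documents score higher on average gets a larger boost.
     * The formula is an assumption made for this sketch.
     */
    public static double[] binBoosts(float[] scores, int[] binOfDoc, int numBins, double alpha) {
        double[] sum = new double[numBins];
        int[] count = new int[numBins];
        for (int i = 0; i < scores.length; i++) {
            sum[binOfDoc[i]] += scores[i];
            count[binOfDoc[i]]++;
        }
        double[] mean = new double[numBins];
        double maxMean = 0.0;
        for (int b = 0; b < numBins; b++) {
            mean[b] = count[b] == 0 ? 0.0 : sum[b] / count[b];
            maxMean = Math.max(maxMean, mean[b]);
        }
        double[] boost = new double[numBins];
        for (int b = 0; b < numBins; b++) {
            // No bin is dropped: weaker bins simply receive a smaller multiplier.
            boost[b] = maxMean == 0.0 ? 1.0 : 1.0 + alpha * (mean[b] / maxMean);
        }
        return boost;
    }

    /** Multiplies each document's score by the boost of the bin it belongs to. */
    public static double[] boostedScores(float[] scores, int[] binOfDoc, double[] boosts) {
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = scores[i] * boosts[binOfDoc[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] scores = {1f, 3f, 1f, 1f};
        int[] bins = {0, 0, 1, 1}; // docs 0-1 in bin 0, docs 2-3 in bin 1
        double[] boosts = binBoosts(scores, bins, 2, 0.5);
        System.out.println(Arrays.toString(boostedScores(scores, bins, boosts)));
    }
}
```

   A hard cutoff would discard every document in a low-scoring bin. In this 
sketch those documents stay in the result set and are only ranked lower.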
   
   The PR also adds:
     • index-time graph-based binning (exact + approximate). This adds 
minimal indexing latency but gives significant improvements in search time.
     • bin-level boosting at segment level
   
   So while the high-level idea overlaps, the implementation is more ambitious. 
It also opens the door to extensions such as bin skipping and multi-field 
support, where graph intersection or fusion could identify documents that are 
strongly connected across multiple semantic dimensions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

