[
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103601#comment-17103601
]
David Smiley commented on SOLR-13289:
-------------------------------------
Should we really add {{numFoundExact="true"}} on responses where the user
didn't even specify a parameter to control this new feature? I prefer not
adding the noise.
I like the name {{numFoundExact}} in the response compared to others we
explored – a last-minute change, I see. Wouldn't we want the controlling
parameter to use "numFound" likewise instead of "hits"? I propose
{{minNumFoundToBeExact}}. The word "hits" isn't particularly widespread in
Solr, except for cache hits.
I spent some time today reviewing what you pushed more closely, and especially
testing my theory that there is a problem with interactions with the Collapse
PostFilter/Collector. +There is, albeit not a big problem.+ Essentially the
Collapse PostFilter must see and cache all docs before passing those it deems
appropriate on to the rest of the collectors. TopDocs Collector is downstream
of it, and TDC tries to tell the Scorer to do approximation stuff, but it is in
vain because by this point all the docs are already accumulated and cached by
Collapse. Other than a possible waste in computation, it ultimately results in
Solr saying that the results weren't exact when they are actually exact.
I pushed a commit to my fork to demonstrate the problem:
[https://github.com/dsmiley/lucene-solr/commit/8803db97a5e4deb0ad5f3bdaabd02cd3b302a09f]
Interestingly I see some other test failures there.
I think the solution is in
{{org.apache.solr.search.SolrIndexSearcher#getDocListNC}} in the second half of
the method ({{lastDocRequested <= 0}} i.e. top-X results case), right before
{{buildTopDocsCollector}} is invoked: set
{{cmd.setMinExactHits(Integer.MAX_VALUE);}} only if {{pf.postFilter.scoreMode}}
isn't null and isn't TOP_SCORES, thus it's one of the two COMPLETE options.
COMPLETE means the Scorer needs to yield all matching docs.
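To make the proposed condition concrete, here is a minimal standalone sketch of the decision only. It is not the actual patch: the {{ScoreMode}} enum below is a local stand-in for Lucene's {{org.apache.lucene.search.ScoreMode}}, and the class and helper names ({{MinExactHitsSketch}}, {{effectiveMinExactHits}}) are hypothetical. In Solr the result would presumably be passed to {{cmd.setMinExactHits(...)}} right before {{buildTopDocsCollector}}.

```java
public class MinExactHitsSketch {

    // Local stand-in for org.apache.lucene.search.ScoreMode (assumption:
    // only the three values discussed in this comment are modeled).
    public enum ScoreMode { COMPLETE, COMPLETE_NO_SCORES, TOP_SCORES }

    /**
     * Hypothetical helper: returns the min-exact-hits threshold to use.
     * If a post filter is present (score mode not null) and it needs every
     * matching doc (anything other than TOP_SCORES), approximation is
     * pointless, so the count is forced to be exact.
     */
    public static int effectiveMinExactHits(ScoreMode postFilterScoreMode,
                                            int requestedMinExactHits) {
        boolean postFilterNeedsAllDocs =
            postFilterScoreMode != null
                && postFilterScoreMode != ScoreMode.TOP_SCORES;
        return postFilterNeedsAllDocs ? Integer.MAX_VALUE : requestedMinExactHits;
    }

    public static void main(String[] args) {
        // Collapse-style post filter that must see all docs: exact count.
        System.out.println(effectiveMinExactHits(ScoreMode.COMPLETE, 100));   // 2147483647
        // Post filter that tolerates approximation: keep the requested value.
        System.out.println(effectiveMinExactHits(ScoreMode.TOP_SCORES, 100)); // 100
        // No post filter at all: keep the requested value.
        System.out.println(effectiveMinExactHits(null, 100));                 // 100
    }
}
```

With this rule, a Collapse query would report its count as exact, since the threshold can never be reached early.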
> Support for BlockMax WAND
> -------------------------
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
> Issue Type: New Feature
> Reporter: Ishan Chattopadhyaya
> Assignee: Tomas Eduardo Fernandez Lobbe
> Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to
> expose this via Solr. When enabled, the numFound returned will not be exact.