Hi Deepak,

As Hoss explains it, there wouldn't be any effect of changing the order of
individual search terms.

In addition, you could look at the Scoring algo:
http://lucene.apache.org/core/2_9_4/scoring.html#Algorithm,
http://lucene.apache.org/core/2_9_4/api/core/org/apache/lucene/search/package-summary.html#scoring

Also the advancing logic is in the Scorer's (inherited) advance method:
http://lucene.apache.org/core/2_9_4/api/core/org/apache/lucene/search/DocIdSetIterator.html#advance%28int%29

- For the AND query a leap frog/ skip ahead pattern is implemented at the
BooleanScorer2 (ConjunctionScorer) level.

For the query, q=A AND B

where A & B match doc. id's
A -> 1,3,5,7,11,15,17
B -> 2, 6

- Scorers start with the min. of each, i.e. A -> 1  & B -> 2, & current
highest doc id set to 2

In the next few iterations:
A is advanced past the current highest value to 3 & current highest updated
to 3.
B advanced past current highest 3 to 6 & current highest updated to 6.
A advanced past 6 to 7 & current highest updated to 7.
B has no more docs & this breaks out, without any match.

- On the other hand if the two had converged/ agreed on a particular doc
id, that doc would be scored & collected (added to a min-heap).

Please add if there's something amiss in the explanation.

Regards,
Aloke


On Sat, Aug 31, 2013 at 5:43 AM, Chris Hostetter
<hossman_luc...@fucit.org>wrote:

>
> : while the results would be identical irrespective of the order in which
> you
> : AND
> : A, B and C, will the response times of the following queries differ in
> any
> : way?
> :
> : C && B && A
> : A && B && C
>
> The queries won't be *cached* the same at the solr level, because the
> BooleanQuery generated by the parsers won't be 100% identical, but the
> execution of those (uncached) queries should be virtualy indential.
>
> : Does Lucene/Solr pick the best query execution plan in terms of both
> space
> : and time for a given query?
>
> It's not a "query execution plan" so much as it is a "skip ahead" pattern.
> the Sub-Scorers for each of the sub-queries are looped over and each one
> is asked to identify the "first" doc id (X) that it matches, after or
> equal to the "first" doc id (Y) returned by the last sub-query consulted
> -- starting with Y=0.  And each time a new "X" is found, the looping
> starts again on the remaining subscorers until a match is found (or we run
> out of documents)
>
> So regardless of what the order of clauses are in the original
> BooleanQuery, the Scorers for each clause are constantly reordered based
> on what the "first" document they match after the currently considered
> document is.
>
> -Hoss
>

Reply via email to