Speaking of performance, you should take a look at the difference in
performance between:


- disjunction of k sorted arrays: O(n*k*log(k)) in Lucene, where *k* is the
number of disjunction clauses and *n* is the average posting list size (I
just learned this today from an expert Lucene committer)

- conjunction of k sorted arrays: I am not 100% sure about the complexity
and should check the actual algorithm, but I suspect there is little or no
difference (I would be glad if someone here could point to resources or
share their knowledge).

Basically, when dealing with the union or intersection of sorted arrays, the
algorithms that solve the two problems are quite comparable in terms of
performance.
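
To make the comparison concrete, here is a minimal Python sketch of both
operations on sorted doc-ID lists. It is only an illustration, not Lucene's
actual scorer code: the function names `union_sorted` and `intersect_sorted`
are made up for this example, and each list is assumed to hold strictly
increasing doc IDs. The union is a heap-based k-way merge, where each of the
roughly n*k postings costs one log(k) heap operation. The intersection uses a
candidate doc ID and binary search to skip ahead, so its cost in this sketch
depends on how the lists overlap.

```python
import heapq
from bisect import bisect_left


def union_sorted(postings):
    """Disjunction (OR): k-way merge of sorted doc-ID lists via a min-heap."""
    # One heap entry per non-empty list: (current doc ID, list index, position).
    heap = [(lst[0], i, 0) for i, lst in enumerate(postings) if lst]
    heapq.heapify(heap)
    result = []
    while heap:
        doc, i, pos = heapq.heappop(heap)
        # Emit each doc ID once, even if several lists contain it.
        if not result or result[-1] != doc:
            result.append(doc)
        if pos + 1 < len(postings[i]):
            heapq.heappush(heap, (postings[i][pos + 1], i, pos + 1))
    return result


def intersect_sorted(postings):
    """Conjunction (AND): advance every list to a shared candidate doc ID."""
    if not postings or any(not lst for lst in postings):
        return []
    positions = [0] * len(postings)
    result = []
    candidate = postings[0][0]
    while True:
        matched = True
        for i, lst in enumerate(postings):
            # Skip ahead to the first doc ID >= candidate in this list.
            pos = bisect_left(lst, candidate, positions[i])
            if pos == len(lst):
                return result  # one list is exhausted: no more matches
            positions[i] = pos
            if lst[pos] != candidate:
                candidate = lst[pos]  # larger candidate, restart the round
                matched = False
                break
        if matched:
            result.append(candidate)
            positions[0] += 1
            if positions[0] == len(postings[0]):
                return result
            candidate = postings[0][positions[0]]


postings = [[1, 3, 5, 7], [3, 4, 5, 8], [2, 3, 5, 9]]
print(union_sorted(postings))
print(intersect_sorted(postings))
```

Note that the union has to touch every posting, while the intersection can
stop as soon as any one list runs out.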

I would say that the performance difference is negligible, but I would like
someone to contradict me.

Cheers



2015-07-15 17:34 GMT+01:00 Steven White <swhite4...@gmail.com>:

> Hi Erick,
>
> I understand there are variables that will impact ranking.  However, if I
> leave my edismax setting as is and simply switch from AND to OR as the
> default Boolean, now if a user types "apples oranges" (without quotes) will
> the ranking be the same as when I had AND?  Will the performance be the
> same as when I had AND as the default?
>
> Thanks
>
> Steve
>
> On Wed, Jul 15, 2015 at 12:26 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > This is really an apples/oranges comparison. They're essentially
> different
> > queries, and scores aren't comparable across different queries.
> >
> > If you're asking "if doc 1 and doc 2 are returned by defaulting to AND or
> > OR,
> > are they in the same position relative to each other?" then I'm pretty
> > sure the
> > answer is "you can't count on it". You'll match on different fields
> > depending on
> > what the default is, and with boosting you just don't know.
> >
> > Best,
> > Erick
> >
> > On Wed, Jul 15, 2015 at 9:14 AM, Steven White <swhite4...@gmail.com>
> > wrote:
> > > By the way, using OR as the default, other than returning more results
> as
> > > more words are entered, the ranking and performance of the search
> remains
> > > the same right?
> > >
> > > Steve
> > >
> > > On Wed, Jul 15, 2015 at 12:12 PM, Steven White <swhite4...@gmail.com>
> > wrote:
> > >
> > >> Thank you all.  Looks like OR is a better choice vs. AND.
> > >>
> > >> Charles: I don't understand what you mean by the "spellcheck
> component".
> > >> Do you mean OR works best with spell checker?
> > >>
> > >> Steve
> > >>
> > >> On Wed, Jul 15, 2015 at 11:07 AM, Reitzel, Charles <
> > >> charles.reit...@tiaa-cref.org> wrote:
> > >>
> > >>> A common approach to this problem is to include the spellcheck
> > component
> > >>> and, if there are corrections, include a "Did you mean ..." link in
> the
> > >>> results page.
> > >>>
> > >>> -----Original Message-----
> > >>> From: Walter Underwood [mailto:wun...@wunderwood.org]
> > >>> Sent: Wednesday, July 15, 2015 10:36 AM
> > >>> To: solr-user@lucene.apache.org
> > >>> Subject: Re: Which default Boolean operator to set, AND or OR?
> > >>>
> > >>> The AND default has one big problem. If the user misspells a single
> > word,
> > >>> they get no results. About 10% of queries are misspelled, so that
> > means a
> > >>> lot more failures.
> > >>>
> > >>> wunder
> > >>> Walter Underwood
> > >>> wun...@wunderwood.org
> > >>> http://observer.wunderwood.org/  (my blog)
> > >>>
> > >>>
> > >>> On Jul 15, 2015, at 7:21 AM, Jack Krupansky <
> jack.krupan...@gmail.com>
> > >>> wrote:
> > >>>
> > >>> > It is simply precision (AND) vs. recall (OR) - the former tries to
> > >>> > limit the total result count, while the latter tries to focus on
> > >>> > relevancy of the top results even if the total result count is
> > higher.
> > >>> >
> > >>> > Recall is good for discovery and browsing, where you sort of know
> > what
> > >>> > you generally want, but not exactly with any great precision.
> > >>> >
> > >>> > Recall will include results that almost meet the query terms, but
> > >>> > maybe some are missing.
> > >>> >
> > >>> > Precision will guarantee and insist that all query terms are
> present.
> > >>> >
> > >>> > One great example for recall is a plagiarism query - enter all the
> > >>> > terms for a passage and then find documents that most closely
> > >>> > approximate the passage without being necessarily exact matches.
> IOW,
> > >>> > the plagiarizer changes a word here and there.
> > >>> >
> > >>> > -- Jack Krupansky
> > >>> >
> > >>> > On Wed, Jul 15, 2015 at 8:16 AM, Steven White <
> swhite4...@gmail.com>
> > >>> wrote:
> > >>> >
> > >>> >> Hi Everyone,
> > >>> >>
> > >>> >> Out-of-the box, Solr (Lucene?) is set to use OR as the default
> > >>> >> Boolean operator.  Can someone tell me the advantages /
> > disadvantages
> > >>> >> of using OR or AND as the default?
> > >>> >>
> > >>> >> I'm leaning toward AND as the default because the more words a
> user
> > >>> >> types, the narrower the result set should be.
> > >>> >>
> > >>> >> Thanks
> > >>> >>
> > >>> >> Steve
> > >>> >>
> > >>>
> > >>>
> > >>>
> > *************************************************************************
> > >>> This e-mail may contain confidential or privileged information.
> > >>> If you are not the intended recipient, please notify the sender
> > >>> immediately and then delete it.
> > >>>
> > >>> TIAA-CREF
> > >>>
> > *************************************************************************
> > >>>
> > >>>
> > >>
> >
>



-- 
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
