: > title:(gone with the wind)^3.0 OR title2:(gone with the wind)
: That did it!  Thanks for the Help!
: What value do the numbers carry in the ranking?  I arbitrarily choose
: the number 5 cause it's an easy number :)

query boosts are in fact pretty arbitrary ... what you should pick really
depends on what boosts you put on other clauses, and what kinds of values
the tf, idf, and coord functions of your Similarity are going to return.

: I am a bit nervous about the dismax query system as I have quite a bit
: of other content that could skew the results.

i'm really not sure what you mean by that ... dismax will only look at the
fields you tell it to, and the factors that contribute to the score of each
term/document pair in a dismax query will be the same as those from the
standard request handler -- the only difference is how those individual
TermQuery scores are combined.

: What's the difference between the dismax query handler and listing all of
: the fields in my search and separating them with an OR?

the best way to understand this is to look at the debug output you get
from each query, and read the "Explanation" section ... some of the deep
details may not make much sense, but the overall structure of the score
calculation should be helpful
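
As a rough sketch of how you might request that debug output for both styles of query, the snippet below builds two request URLs with debugQuery turned on. The host, port, and path are assumptions about a default local install, and selecting dismax with qt=dismax assumes a handler registered under that name in your solrconfig.xml, so adjust both for your setup.

```python
from urllib.parse import urlencode

# Assumed base URL for a local install; change it to match your server.
BASE = "http://localhost:8983/solr/select"


def debug_url(params):
    """Return a select URL with debug output switched on."""
    return BASE + "?" + urlencode({**params, "debugQuery": "on"})


# Standard handler: every field is spelled out in the query string.
standard_url = debug_url({"q": "title:(foo bar) other:(foo bar)"})

# dismax: plain words in q, fields listed in qf.
# "qt=dismax" assumes a handler named "dismax" is configured.
dismax_url = debug_url({"q": "foo bar", "qf": "title other", "qt": "dismax"})

print(standard_url)
print(dismax_url)
```

Fetching each URL and comparing the score explanations side by side shows where the two handlers diverge.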

in a nutshell, when you ask the StandardRequestHandler for docs
matching...
     q = title:(foo bar) other:(foo bar)

if a document matches title:foo, other:foo, and other:bar then the
score for that document is (essentially) the sum of the scores from
matching the individual terms

with dismax, if you ask for

     q = foo bar  & qf = title other

then the score for the same document is different: the matches on
the word "foo" are considered together regardless of field, and only the
field that resulted in the highest score is used (with a small portion of
the scores from the other fields being included to help break ties).  the
score contribution from matching on other:bar is basically the same as
before.
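
To make the arithmetic concrete, here is a minimal sketch of the two ways of combining scores for that one document. The per-clause scores and the tie-breaker value of 0.1 are invented for illustration, and the sketch ignores other factors such as coord and query normalization, so treat it as the shape of the calculation rather than what Lucene actually computes.

```python
# Made-up scores for the (field, term) pairs that the example document matches.
scores = {
    ("title", "foo"): 0.8,
    ("other", "foo"): 0.3,
    ("other", "bar"): 0.5,
}
FIELDS = ["title", "other"]
TERMS = ["foo", "bar"]


def summed_score(scores):
    """Standard handler style: add up every matching clause."""
    return sum(scores.values())


def dismax_score(scores, tie=0.1):
    """dismax style: per word, keep the best field plus a fraction of the rest."""
    total = 0.0
    for term in TERMS:
        per_field = [scores.get((field, term), 0.0) for field in FIELDS]
        best = max(per_field)
        total += best + tie * (sum(per_field) - best)
    return total


print(summed_score(scores))  # about 1.6: all three clauses added together
print(dismax_score(scores))  # about 1.33: "foo" counts mostly once, "bar" as before
```

With the tie-breaker set to 0 the word "foo" contributes only its best field; with it set to 1 the result is the plain sum again.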

The driving motivation for the DisjunctionMaxQuery was this: if you
wanted to search for the words "Java" or "Lucene" in 3 different fields
(title, description, and body), a document that matched Lucene once in the
body field, but matched Java dozens of times and at least once in each
field, wouldn't overshadow a document that matched both Lucene and Java
just once in each field.
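
A toy version of that scenario, with invented per-field scores, shows how the ranking can flip. The numbers assume that repeated matches raise a field's score and that a match in a short field like title scores higher than one in body; real scores depend on your Similarity and your index.

```python
def summed(doc):
    """Add every per-field score for every word."""
    return sum(sum(fields.values()) for fields in doc.values())


def dismax(doc, tie=0.0):
    """Per word, take the best field plus tie * the remaining fields."""
    total = 0.0
    for fields in doc.values():
        best = max(fields.values())
        total += best + tie * (sum(fields.values()) - best)
    return total


# Hypothetical doc A: Java many times in every field, Lucene once in body.
doc_a = {
    "java": {"title": 0.8, "description": 0.6, "body": 0.5},
    "lucene": {"body": 0.1},
}
# Hypothetical doc B: Java and Lucene once each in every field.
doc_b = {
    "java": {"title": 0.5, "description": 0.2, "body": 0.1},
    "lucene": {"title": 0.5, "description": 0.2, "body": 0.1},
}

print(summed(doc_a) > summed(doc_b))  # plain sum: the Java-heavy doc A is ahead
print(dismax(doc_a) < dismax(doc_b))  # dismax: doc B, matching both words well, is ahead
```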


-Hoss
