: Maybe what I really need is a query parser that does not do "disjunction
: maximum" at all, but somehow still combines different 'qf' type fields with
: different boosts on each field. I personally don't _necessarily_ need the
: actual "disjunction max" calculation, but I do need combining of multiple
: fields with different boosts. Of course, I'm not sure exactly how it would
: combine multiple fields if not "disjunction maximum", but perhaps one is
: conceivable that wouldn't be subject to this particular gotcha with differing
: analysis.

you can sort of do that today, something like this should work...

 q  = _query_:"$q1"^100 _query_:"$q2"^10 _query_:"$q3"^5 _query_:"$q4"
 q1 = {!lucene df=title v=$qq}
 q2 = {!lucene df=summary v=$qq}
 q3 = {!lucene df=author v=$qq}
 q4 = {!lucene df=body v=$qq}
 qq = ...user input here...

..but you might want to replace "lucene" with "field" depending on what 
metacharacters you want to support.
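
if you have more than a handful of fields, the params can be generated. a rough Python sketch, where build_params and the field list are hypothetical names made up for illustration; only the param layout is taken from the example above, and nothing here talks to Solr...

```python
from urllib.parse import urlencode


def build_params(user_input, field_boosts, parser="lucene"):
    """Build request params: one sub-query per field, combined in q.

    field_boosts is a list of (field, boost) pairs; a boost of None
    means the clause gets no ^boost suffix.
    """
    params = {}
    clauses = []
    for i, (field, boost) in enumerate(field_boosts, start=1):
        name = f"q{i}"
        # every sub-query parses the same user input against one field
        params[name] = f"{{!{parser} df={field} v=$qq}}"
        clause = f'_query_:"${name}"'
        if boost is not None:
            clause += f"^{boost}"
        clauses.append(clause)
    params["q"] = " ".join(clauses)
    params["qq"] = user_input
    return params


params = build_params(
    "albino elephant",
    [("title", 100), ("summary", 10), ("author", 5), ("body", None)],
)
# URL-encoded form, ready to append to whatever select URL you use
query_string = urlencode(params)
```

passing parser="field" gives the "field" variant mentioned above.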

in general though the reason i wrote the dismax parser (instead of a
parser that works like this) is because of how multiword queries wind up
matching/scoring.  A guy named Chuck Williams wrote the earliest
version of the DisjunctionMaxQuery class and his "albino elephant"
example totally sold me on this approach back in 2005...

http://www.lucidimagination.com/search/document/8ce795c4b6752a1f/contribution_better_multi_field_searching
https://issues.apache.org/jira/browse/LUCENE-323
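
the gist of that example, as a toy Python sketch. the per-field scores are made up (every match is worth 1.0) and the function names are hypothetical; real Lucene scoring has many more factors, this only shows sum-across-fields versus max-across-fields for the query "albino elephant"...

```python
def sum_score(doc, terms, fields):
    """Plain sum: every (term, field) match adds to the score."""
    return sum(doc.get((t, f), 0.0) for t in terms for f in fields)


def dismax_score(doc, terms, fields, tie=0.0):
    """Per term, take the best field score (plus tie * the other
    field scores), then add the terms together."""
    total = 0.0
    for t in terms:
        scores = [doc.get((t, f), 0.0) for f in fields]
        best = max(scores)
        total += best + tie * (sum(scores) - best)
    return total


terms = ["albino", "elephant"]
fields = ["title", "body"]

# doc_a: "albino" matches in both fields, "elephant" nowhere
doc_a = {("albino", "title"): 1.0, ("albino", "body"): 1.0}
# doc_b: "albino" in title, "elephant" in body
doc_b = {("albino", "title"): 1.0, ("elephant", "body"): 1.0}

# summing cannot tell the two docs apart
sum_a = sum_score(doc_a, terms, fields)
sum_b = sum_score(doc_b, terms, fields)

# max-per-term prefers the doc that matches both words
dismax_a = dismax_score(doc_a, terms, fields)
dismax_b = dismax_score(doc_b, terms, fields)
```

with these toy numbers the summed scores tie, while the max-per-term scores rank doc_b (both words matched) above doc_a (one word matched twice).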

: I also remain kind of confused about how the existing dismax figures out "how
: many terms" for the 'mm' type calculations. If someone wanted to explain that,
: I would find it enlightening and helpful for understanding what's going on.

it's not really about terms -- it's just the total number of clauses in 
the outer BooleanQuery that it builds.  if a chunk of input produces a 
valid DisjunctionMaxQuery (because the analyzer for at least one qf field 
generated tokens) then that's a clause, if a chunk of input doesn't 
produce a token (because none of the analyzers from any of the qf fields 
generated tokens) then that's not a clause.
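
a toy Python sketch of that counting rule. the two "analyzers" here are hypothetical stand-ins (a tokenizer with and without stopword removal), not Solr's analysis chain, and whitespace splitting is only an assumption about how the input gets chunked...

```python
import re

STOPWORDS = {"the", "a", "of"}


def plain_analyzer(chunk):
    """Lowercase and split into word tokens; keeps stopwords."""
    return re.findall(r"\w+", chunk.lower())


def stopword_analyzer(chunk):
    """Same tokens as plain_analyzer, minus stopwords."""
    return [t for t in plain_analyzer(chunk) if t not in STOPWORDS]


def count_clauses(chunks, analyzers):
    """A chunk counts as a clause if at least one field's analyzer
    produces a token for it."""
    count = 0
    for chunk in chunks:
        if any(analyze(chunk) for analyze in analyzers.values()):
            count += 1
    return count


chunks = "the albino elephant".split()

# every qf field strips stopwords: "the" produces nothing anywhere
all_stop = {"title": stopword_analyzer, "body": stopword_analyzer}
# one qf field keeps stopwords: "the" still produces a clause
mixed = {"title": stopword_analyzer, "body": plain_analyzer}

clauses_all_stop = count_clauses(chunks, all_stop)
clauses_mixed = count_clauses(chunks, mixed)
```

the same input gives 2 clauses when every field drops "the" and 3 when one field keeps it, so the number that 'mm' is computed against changes with the mix of analyzers on the qf fields.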


-Hoss
