: how would you handle a query like "johnson AND johnson"? i don't want
: something that has "author: linden b. johnson" to hit, only things that
: actually have two occurrences.

I'm not even sure if/how that would be possible using the underlying 
lucene Query objects available -- IIUC the BooleanQuery(.Builder) class 
will optimize away duplicate clauses.

I think what you would need is something like TermQuery with a "minTf" 
option? ... the code for that probably wouldn't be too hard -- but I'm not 
sure how you'd solve it for the general case, ex: "(+(A B) +(B C))" 
(such that if neither A nor C match then there must be 2 instances of B).
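
To make the two ideas above concrete, here is a rough sketch in plain Python. It is not Lucene code: "minTf" is only the hypothetical option mentioned above, and the function names and the brute-force approach are assumptions made for illustration. The second function treats the general case as "every required clause must be satisfied by a different token position".

```python
from collections import Counter
from itertools import product


def matches_min_tf(tokens, term, min_tf):
    """True if `term` occurs at least `min_tf` times in `tokens`."""
    return Counter(tokens)[term] >= min_tf


def matches_distinct(tokens, clauses):
    """True if every clause can be satisfied by a different token position.

    `clauses` is a list of sets of alternative terms, e.g.
    [{"a", "b"}, {"b", "c"}] for "(+(A B) +(B C))".
    Brute force over all position choices: fine for illustration,
    not something you would run against a real index.
    """
    # For each clause, the positions whose token is one of its alternatives.
    candidates = [
        [i for i, tok in enumerate(tokens) if tok in alternatives]
        for alternatives in clauses
    ]
    # A document matches if some choice of one position per clause
    # uses no position twice.
    return any(len(set(choice)) == len(choice) for choice in product(*candidates))
```

With clauses [{"a", "b"}, {"b", "c"}], a document containing only a single "b" does not match, while one with two "b" tokens (or an "a" and a "c") does.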

Oh wait ... one way you could probably do this would be with SpanNotQuery?

I think something like "SpanNotQuery(SpanTermQuery("johnson"), 
SpanTermQuery("johnson"))" would work.

I believe the only existing Solr QParser that can create SpanNotQueries is 
the XMLQueryParser...

https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-XMLQueryParser


-Hoss
http://www.lucidworks.com/
