On 30-Aug-07, at 4:01 PM, Chris Hostetter wrote:


You could accomplish the goal without any coding by using phrase queries: "calico calico calico"~10000 will match only documents that have at least three occurrences of calico. If this is performant enough, you are done. Otherwise, you'll have to do some custom coding.

I'll be searching article content so literals like "cat cat cat" are improbable.

I think you misunderstood Mike's point ... the query string...
     foo:"cat cat cat"~10000

...will only match documents containing three instances of the term "cat" in the field "foo" where those instances are all within 10000 term positions of each other ... the idea being that as long as the "slop" (number) used is bigger than the largest document you expect to deal with, this will essentially give you what you want.
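
For illustration only, here is a minimal Python sketch of how such a query string could be built for an arbitrary term and minimum count. The function name min_occurrences_query and its parameters are hypothetical, not part of Solr or Lucene, and the sketch assumes the term is a single token with no characters that need escaping in the query syntax.

```python
def min_occurrences_query(field, term, min_count, slop=10000):
    """Build a sloppy phrase query that repeats `term` `min_count` times.

    Hypothetical helper: it only assembles the query string and does not
    escape special query-parser characters in `field` or `term`.
    """
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    # Repeat the term so the phrase needs min_count separate occurrences.
    phrase = " ".join([term] * min_count)
    # A slop larger than any document makes the position constraint moot.
    return f'{field}:"{phrase}"~{slop}'


print(min_occurrences_query("foo", "cat", 3))
```

The default slop of 10000 is just the value used in this thread; pick something at least as large as the longest field you expect to index.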

Note too that by default Solr only indexes the first 10k tokens of a field (the maxFieldLength setting), so a slop of 10000 should work for all documents in the index.

-Mike
