It's not possible right now because Lucene doesn't support this. When running a disjunction query, it only records how many terms matched the document. I think this is a common requirement for many users. I suggest Lucene should split the scorer into a matcher and a scorer. The matcher would just report which docs matched and why/how each doc matched. Especially for a disjunction query, it should tell which terms matched, and possibly other information such as tf/idf and the distance between terms (to support proximity search). That's the matcher's job. Then the scorer (a ranking algorithm) can use whatever algorithm it likes to score the document, and the collector can collect it.
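To make the idea concrete, here is a rough sketch of the matcher/scorer split in plain Python. It does not use Lucene or Solr at all. Every name in it (`Match`, `match_disjunction`, `coverage_score`, `search`) and the 0.7 threshold are made up for illustration, and the whitespace tokenizer stands in for a real analyzer. The matcher reports which terms matched and at which positions, and a separate scorer turns that into the "how much of the field did the input cover" signal Chris is asking about.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """What the (hypothetical) matcher reports for one matching document."""
    doc_id: int
    matched_terms: dict  # term -> list of token positions where it occurred
    field_length: int    # number of tokens in the field


def match_disjunction(query_terms, docs):
    """Matcher: yield every doc containing at least one query term,
    together with which terms matched and where."""
    wanted = set(query_terms)
    for doc_id, text in enumerate(docs):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        positions = {}
        for pos, tok in enumerate(tokens):
            if tok in wanted:
                positions.setdefault(tok, []).append(pos)
        if positions:
            yield Match(doc_id, positions, len(tokens))


def coverage_score(match):
    """Scorer: fraction of the field's tokens that the query accounted for."""
    covered = sum(len(p) for p in match.matched_terms.values())
    return covered / match.field_length


def search(query, docs, min_coverage=0.7):
    """Run the matcher, score each match, and keep docs above the threshold.
    The 0.7 default is an arbitrary value chosen for this example."""
    terms = query.lower().split()
    hits = [(m.doc_id, coverage_score(m)) for m in match_disjunction(terms, docs)]
    return [(doc_id, score) for doc_id, score in hits if score >= min_coverage]
```

With the titles `["dust in the wind", "dust"]`, the query "dust in wind" covers 3 of the 4 tokens of the first title and passes the threshold, while the query "dust" covers only 1 of 4 and is filtered out for that title. Because the matcher also keeps positions, a proximity-based scorer could be swapped in without touching the matching code.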
On Wed, Apr 11, 2012 at 10:28 AM, Chris Book <chrisb...@gmail.com> wrote:
> Hello, I have a solr index running that is working very well as a search.
> But I want to add the ability (if possible) to use it to do matching. The
> problem is that by default it is only looking for all the input terms to be
> present, and it doesn't give me any indication as to how many terms in the
> target field were not specified by the input.
>
> For example, if I'm trying to match to the song title "dust in the wind",
> I'm correctly getting a match if the input query is "dust in wind". But I
> don't want to get a match if the input is just "dust". Although as a
> search "dust" should return this result, I'm looking for some way to filter
> this out based on some indication that the input isn't close enough to the
> output. Perhaps if I could get information that the number of input
> terms is much less than the number of terms in the field. Or something
> else along those lines?
>
> I realize that this isn't the typical use case for a search, but I'm just
> looking for some suggestions as to how I could improve the above example a
> bit.
>
> Thanks,
> Chris