I searched my mail but found nothing more. The thread that turns up for the keywords "boolean expression" is "Indexing Boolean Expressions" from joaquin.delgado.

To tell which terms matched, a simple approach for BooleanScorer2 is to modify DisjunctionSumScorer and add a BitSet that records the matched sub-scorers. When the collector collects a document, it can get the scorer and recursively find the matched terms.

But I think it may be better to add a separate component, perhaps named a matcher, that does the matching, and to have the scorer use the information from the matcher to do the ranking.
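As a rough illustration of the BitSet idea, here is a toy sketch in plain Java. It does not use Lucene's classes and is not how DisjunctionSumScorer is implemented: MatchedClauseSketch, Hit and disjunction are made-up names, and each clause's postings are assumed to be a sorted int array of doc ids. It only shows a disjunction recording which clauses matched each document, so that a collector could read that information later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical toy model, not Lucene code.
public class MatchedClauseSketch {

    /** One hit: a doc id plus one bit per clause that matched it. */
    public static final class Hit {
        public final int doc;
        public final BitSet matched;

        Hit(int doc, BitSet matched) {
            this.doc = doc;
            this.matched = matched;
        }
    }

    /**
     * Toy disjunction. postings[i] is the sorted doc-id list of clause i.
     * A doc is kept when at least minShouldMatch clauses contain it.
     */
    public static List<Hit> disjunction(int[][] postings, int minShouldMatch) {
        // Collect every candidate doc id in increasing order.
        TreeSet<Integer> docs = new TreeSet<>();
        for (int[] p : postings) {
            for (int d : p) {
                docs.add(d);
            }
        }
        List<Hit> hits = new ArrayList<>();
        for (int doc : docs) {
            // Record which clauses matched, not only how many.
            BitSet matched = new BitSet(postings.length);
            for (int i = 0; i < postings.length; i++) {
                if (Arrays.binarySearch(postings[i], doc) >= 0) {
                    matched.set(i);
                }
            }
            if (matched.cardinality() >= minShouldMatch) {
                hits.add(new Hit(doc, matched));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] terms = {"dust", "in", "wind"};
        int[][] postings = {{1, 2, 5}, {2, 3, 5}, {2, 5, 7}};
        for (Hit h : disjunction(postings, 2)) {
            StringBuilder sb = new StringBuilder();
            for (int i = h.matched.nextSetBit(0); i >= 0; i = h.matched.nextSetBit(i + 1)) {
                sb.append(terms[i]).append(' ');
            }
            System.out.println("doc " + h.doc + " matched: " + sb.toString().trim());
        }
    }
}
```

The collector-side loop in main is where a ranking step could decide what to do with the matched terms, which is roughly the matcher/scorer split I have in mind.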

On Wed, Apr 11, 2012 at 4:32 PM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

> Hi,
>
> This use case is similar to the problem of matching boolean expressions.
> You can find a recent thread about it. I have an idea that we can
> introduce a disjunction query with a dynamic mm (the minShouldMatch
> parameter,
> http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/org/apache/lucene/search/BooleanQuery.html#setMinimumNumberShouldMatch(int)
> ), i.e. 'match these clauses disjunctively, but for every document use
> the value from the field cache of field xxxCount as the minShouldMatch
> parameter'. Norms can also be used as a source for dynamic mm values.
>
> Wdyt?
>
> On Wed, Apr 11, 2012 at 10:08 AM, Li Li <fancye...@gmail.com> wrote:
>
> > It's not possible now because Lucene doesn't support this.
> > When doing a disjunction query, it only records how many terms match
> > the document.
> > I think this is a common requirement for many users.
> > I suggest Lucene should divide the scorer into a matcher and a scorer.
> > The matcher just returns which doc is matched and why/how the doc is
> > matched. Especially for a disjunction query, it should tell which term
> > matches, and possibly other information such as tf/idf and the distance
> > between terms (to support proximity search).
> > That's the matcher's job. Then the scorer (a ranking algorithm) uses a
> > flexible algorithm to score the document, and the collector can
> > collect it.
> >
> > On Wed, Apr 11, 2012 at 10:28 AM, Chris Book <chrisb...@gmail.com> wrote:
> >
> > > Hello, I have a solr index running that is working very well as a
> > > search. But I want to add the ability (if possible) to use it to do
> > > matching. The problem is that by default it is only looking for all
> > > the input terms to be present, and it doesn't give me any indication
> > > as to how many terms in the target field were not specified by the
> > > input.
> > >
> > > For example, if I'm trying to match to the song title "dust in the
> > > wind", I'm correctly getting a match if the input query is "dust in
> > > wind". But I don't want to get a match if the input is just "dust".
> > > Although as a search "dust" should return this result, I'm looking
> > > for some way to filter this out based on some indication that the
> > > input isn't close enough to the output. Perhaps if I could get
> > > information that the number of input terms is much less than the
> > > number of terms in the field. Or something else along those lines?
> > >
> > > I realize that this isn't the typical use case for a search, but I'm
> > > just looking for some suggestions as to how I could improve the above
> > > example a bit.
> > >
> > > Thanks,
> > > Chris
>
> --
> Sincerely yours
> Mikhail Khludnev
> ge...@yandex.ru
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>