OK, let me explain what I am trying to do first, since there may be a better
approach. Recently I have been trying to increase Solr's matching precision
by requiring that all of the words in a field match before allowing a match
on that field. I am using edismax as my query parser, and since it tokenizes
on white space, I can't find a way to get the following behavior: if my query
is q=foo bar and I have a field named somefield indexed as a text field
containing foo bar, then foo alone should not match and bar alone should not
match, but the phrase "foo bar" should match.

I feel like I'm not explaining this very well, but basically what I want to
do has already been done by Lucidworks:
https://lucidworks.com/2014/07/02/automatic-phrase-tokenization-improving-lucene-search-precision-by-more-precise-linguistic-analysis/

However, their solution requires a pluggable query parser that is not an
extension of edismax. I haven't done a deep comparison, but I'm assuming I
would lose access to all of edismax's parameters if I used their pluggable
query parser.

So instead I tried to replicate this functionality using edismax's pf2 and
pf3 parameters. It all works beautifully the way I have it set up, except
that phrase field matches don't count towards my mm count.

OK, so now I will go into detail about how I have my index set up for this
specific example.

I am using Solr's default text field to index a field named manufacturer2.

Here are the relevant parameters of my search:

q=livex lighting 8193
qf=productid, manufacturer_stop
pf2=manufacturer2
mm=3<-1 5<-2 6<90%
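
To make the request concrete, this is roughly how those parameters go out,
as a small Python sketch. The host, port, and core name in BASE_URL are
placeholders made up for illustration, and defType=edismax is implied by my
setup rather than listed above.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical endpoint: host, port, and core name are placeholders.
BASE_URL = "http://localhost:8983/solr/products/select"

params = {
    "defType": "edismax",                  # assumed: edismax is the parser in use
    "q": "livex lighting 8193",
    "qf": "productid, manufacturer_stop",  # copied as written above
    "pf2": "manufacturer2",
    "mm": "3<-1 5<-2 6<90%",
}

# urlencode escapes the spaces, '<' and '%' that appear in q and mm.
url = BASE_URL + "?" + urlencode(params)
print(url)

# Round-trip check: decoding the query string gives back the raw values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["mm"][0])
```

Encoding matters here because the mm value contains both '<' and '%', which
are easy to mangle when the URL is typed by hand.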

I am removing the word lighting from my manufacturer_stop field using
stopwords, so only livex matches in the manufacturer_stop field.

However, "livex lighting" matches in the manufacturer2 field through phrase
field matching via the pf2 parameter.

So my matches are the following:
MATCH livex in the manufacturer_stop field
MATCH 8193 in the productid field
MATCH "livex lighting" in the manufacturer2 field as a phrase field match

So I have three matches. However, the phrase field match doesn't seem to be
counting towards my mm requirement, which says that when 3 tokens are passed,
all 3 must match. If I change my mm to require that only 2 tokens match, I
get the expected result. But I want my phrase field match to count towards my
mm requirement, since lighting is matching in my phrase field.
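
To spell out the arithmetic, here is a rough Python sketch of how I read the
mm syntax in the Solr documentation. It is my own approximation, not Solr's
actual code: the names min_should_match, _apply, and matched_in_qf are made
up for illustration, and the tally assumes that only the per-term qf clauses
are counted, which is what the behavior above suggests.

```python
def _apply(clause_count: int, value: str) -> int:
    """Turn one mm value ("2", "-1", "90%", "-25%") into a required count."""
    if value.endswith("%"):
        percent = int(value[:-1])
        if percent < 0:
            # Negative percentage: that share of the clauses may be missing.
            n = clause_count - (clause_count * -percent) // 100
        else:
            # Positive percentage: rounded down.
            n = (clause_count * percent) // 100
    else:
        n = int(value)
        if n < 0:
            # Negative integer: that many clauses may be missing.
            n = clause_count + n
    return max(n, 0)


def min_should_match(clause_count: int, spec: str) -> int:
    """Approximate how many optional clauses must match for a given mm spec."""
    required = clause_count  # up to the first threshold, every clause is required
    for part in spec.split():
        if "<" in part:
            bound, value = part.split("<", 1)
            if clause_count <= int(bound):
                return required
            required = _apply(clause_count, value)
        else:
            required = _apply(clause_count, part)
    return required


MM = "3<-1 5<-2 6<90%"
terms = ["livex", "lighting", "8193"]
# Terms that matched in a qf field in my example; "lighting" only matched via pf2.
matched_in_qf = {"livex", "8193"}

required = min_should_match(len(terms), MM)
matched = sum(1 for term in terms if term in matched_in_qf)
print(required, matched, matched >= required)  # 3 2 False
```

Under that assumption the query has 3 clauses, mm requires all 3, and only 2
are satisfied by qf fields, which is consistent with the document being
dropped until I relax mm to 2.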

Any assistance would be appreciated. If someone could suggest a better
approach, that would also be appreciated.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Phrase-field-matches-not-counting-towards-minimum-match-tp4322066.html
Sent from the Solr - User mailing list archive at Nabble.com.
