The fq query is probably searching the wrong field. It will use the field
named by the "df" parameter, since fq is parsed by the standard Solr (Lucene)
query parser, not edismax, so your qf setting does not apply to it.
-- Jack Krupansky
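A minimal sketch of one way to act on this: name the field explicitly in fq so the filter does not fall back to "df". The field name prodnameplurals comes from the query later in the thread; the helper name build_filtered_query is hypothetical, and the code only builds the request string without contacting Solr.

```python
from urllib.parse import urlencode


def build_filtered_query(user_query, filter_term, field="prodnameplurals"):
    """Build a select URL whose fq names its field explicitly.

    Assumption: `field` is the stemmed field used in qf. Without the
    `field:` prefix, fq is parsed against the default field (df).
    """
    params = {
        "q": user_query,
        "defType": "edismax",   # applies to q only, not to fq
        "qf": field,
        "fq": f"{field}:{filter_term}",
    }
    return "select?" + urlencode(params)


if __name__ == "__main__":
    print(build_filtered_query("engineer boots", "engineer"))
```

If the filter should go through edismax as well, Solr's local-params syntax (for example fq={!edismax qf=prodnameplurals}engineer) is another option worth trying.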
-----Original Message-----
From: David Quarterman
Sent: Tuesday, February 19, 2013 10:16 AM
To: solr-user@lucene.apache.org
Subject: RE: Edismax odd results
Hi,
This is definitely driving us mad now! Changed to PorterStemming and there's
very little difference.
If we add fq=engineer, we get 0 results. Add fq=engineer* and we get the 90
in the system. Try with fq=ankle* and we get 2. Correct. Try with
fq=harness* and we get 0!
The stemming reduces 'engineer' to 'engin' so I'd have expected a lot more
results.
Anyone got any ideas?
Regards,
DQ
-----Original Message-----
From: David Quarterman [mailto:da...@corexe.com]
Sent: 19 February 2013 17:09
To: solr-user@lucene.apache.org
Subject: RE: Edismax odd results
Hi Shawn/Jack,
The log shows the query going in okay, nothing gets stripped out so we're
still at a loss to understand this. Could it be that Snowball stemming is
too invasive?
Regards,
DQ
-----Original Message-----
From: David Quarterman [mailto:da...@corexe.com]
Sent: 19 February 2013 16:38
To: solr-user@lucene.apache.org
Subject: RE: Edismax odd results
Hi Shawn,
I checked the admin analysis earlier. Stemming is taking 'engineer' down to
'engin', but then I'd have thought that a search on 'engin boots' would work
but it doesn't.
I'll try turning the wick back up on the logging - we set it to 'warning'.
Regards,
DQ
-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: 19 February 2013 16:25
To: solr-user@lucene.apache.org
Subject: Re: Edismax odd results
I do not see the word engineer (or any other similar word) in the score
calculation, only boots. A test on my own index shows both words in the
calculations. I would use the analysis admin page on the prodnameplurals
field to see what happens to the input of "engineer boots" on both index and
query - see what part of your analysis chain removes it.
If you don't see any problem there, then the Solr log (assuming you haven't
changed the default log level of INFO) should have a record of what
parameters were actually received when the query was made.
Thanks,
Shawn
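The same check can be scripted rather than done through the admin page. This is a sketch only: it assumes the field analysis request handler is registered at /analysis/field in solrconfig.xml, and the base URL http://localhost:8983/solr and the helper name are placeholders. It builds the URL and leaves fetching it to you.

```python
from urllib.parse import urlencode


def build_field_analysis_url(base_url, field, text):
    """Build a field-analysis request for both index-time and query-time analysis.

    Assumption: the FieldAnalysisRequestHandler is mapped to /analysis/field.
    """
    params = {
        "analysis.fieldname": field,   # field whose analyzer chain is applied
        "analysis.fieldvalue": text,   # analyzed as index-time input
        "analysis.query": text,        # analyzed as query-time input
        "wt": "json",
    }
    return f"{base_url}/analysis/field?" + urlencode(params)


if __name__ == "__main__":
    print(build_field_analysis_url(
        "http://localhost:8983/solr", "prodnameplurals", "engineer boots"))
```

Comparing the index-side and query-side token lists in the response should show which stage, if any, drops or alters "engineer".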
On 2/19/2013 9:14 AM, David Quarterman wrote:
Hi Jack,
Here's a test query we've been using:
select?q=+engineer+boots&defType=edismax&fl=prodname&qf=prodnameplurals&pf2=prodnameplurals^2.0
This still produces a result set where the first 'engineer boot' is way
down the list and subsequent ones are interspersed with other boots.
They're all in there, just not at the top. Below is the debug on the first
item that is an engineer boot.
<str name="ITEM_3333">
0.23492618 = (MATCH) sum of:
  0.23492618 = (MATCH) product of:
    0.46985236 = (MATCH) sum of:
      0.46985236 = (MATCH) weight(prodnameplurals:boot in 48270) [DefaultSimilarity], result of:
        0.46985236 = score(doc=48270,freq=1.0 = termFreq=1.0 ), product of:
          0.22236869 = queryWeight, product of:
            4.8295836 = idf(docFreq=1867, maxDocs=86009)
            0.046043035 = queryNorm
          2.112943 = fieldWeight in 48270, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            4.8295836 = idf(docFreq=1867, maxDocs=86009)
            0.4375 = fieldNorm(doc=48270)
    0.5 = coord(1/2)
</str>
Regards,
DQ
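For anyone reading the explain output above, the arithmetic can be reproduced by hand. The sketch below uses only the numbers from that output plus the classic DefaultSimilarity idf formula, 1 + ln(maxDocs / (docFreq + 1)); the function names are made up for illustration. Note that only the term "boot" contributes, and coord(1/2) halves the score because one of the two query clauses did not match.

```python
import math


def classic_idf(doc_freq, max_docs):
    """idf as computed by Lucene's DefaultSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(idf, query_norm, tf, field_norm, coord):
    """Recombine the factors listed in the explain output."""
    query_weight = idf * query_norm          # queryWeight
    field_weight = tf * idf * field_norm     # fieldWeight
    return query_weight * field_weight * coord


if __name__ == "__main__":
    idf = classic_idf(doc_freq=1867, max_docs=86009)
    score = explain_score(idf, query_norm=0.046043035, tf=1.0,
                          field_norm=0.4375, coord=0.5)
    print(f"idf={idf:.6f} score={score:.6f}")
```

Small differences in the last digits are expected, since Lucene does this arithmetic in 32-bit floats.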
-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: 19 February 2013 15:31
To: solr-user@lucene.apache.org
Subject: Re: Edismax odd results
Show us your qf and pf params. Do you have PF2 set? That's the key for
getting the phrase "engineer boots" boosted higher than just boots. You
may also simply have to give a higher PF2 boost since "boots" probably has
a much higher term frequency than "engineer" or even the natural Lucene
score for "engineer boot".
Also check the &debugQuery=true "explain" scoring to see how engineer,
boot, and "engineer boot" are being scored - you may have to add some
specific query phrases to force "engineer boot" into the top results to
compare the scoring.
-- Jack Krupansky
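As a rough sketch of that suggestion, the request below sets pf2 with a larger boost and turns on debugQuery so the "explain" section comes back with the results. The boost value of 10.0 is an arbitrary starting point rather than a recommendation, the field name is taken from the follow-up message in this thread, and build_debug_query is a hypothetical helper that only builds the URL.

```python
from urllib.parse import urlencode


def build_debug_query(user_query, field="prodnameplurals", pf2_boost=10.0):
    """Build an edismax request with a pf2 phrase boost and scoring debug."""
    params = {
        "q": user_query,
        "defType": "edismax",
        "qf": field,
        "pf2": f"{field}^{pf2_boost}",  # boost adjacent word pairs, e.g. "engineer boots"
        "fl": "prodname,score",
        "debugQuery": "true",           # adds the per-document "explain" output
    }
    return "select?" + urlencode(params)


if __name__ == "__main__":
    print(build_debug_query("engineer boots"))
```

Running it with a few different pf2_boost values and comparing the explain output for an engineer boot against a plain boot should show whether the phrase clause is matching at all.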
-----Original Message-----
From: David Quarterman
Sent: Tuesday, February 19, 2013 6:21 AM
To: solr-user@lucene.apache.org
Subject: Edismax odd results
Hi all,
We have an index of boots which contains harness boots, engineer boots,
ankle boots, etc. An edismax search on the index for 'harness boots'
brings back 2,175 boots with 'harness' results at the top. Searching for
'engineer boots' brings back everything but 'engineer boots', same for
'ankle boots' - in fact, same result set of 1,873 mostly boots but a few
other products mixed in.
We're on Solr 4.0 and the field we're querying is tokenized with
WhitespaceTokenizer, lowercased, and stemmed (Snowball). Any ideas?