Hi Erick,

I understand the wildcard issue - that was more desperation on our part than 
logic!

TermsComponent showed 
<lst name="prodnameplurals">
        <int name="engineering:">222</int>
        <int name="engineer">197</int>
</lst>
so the term is in the index.
Using explainOther, I can see that the relevance of documents with 
'engineer boots' in the name is low compared to the others, and they appear 
randomly distributed through the result set (I know it's not random). We've 
tried all sorts of things to boost them, but to no avail. Trying 'logger boots' 
or 'harness boots' gives good results, with the required terms at the top of the 
set.

I'm mystified.

Regards,

DQ

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 20 February 2013 12:49
To: solr-user@lucene.apache.org
Subject: Re: Edismax odd results

OK, first:
wildcarding and stemming don't get along well together. Since you've stemmed 
the field, enginee* would not match the stemmed term engin. This is actually 
pretty tricky to try to implement. For instance, how would enginee stem? So the 
fqs you posted are going to mislead you in that regard.
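
A toy illustration of the point above, in plain Python rather than Solr or Lucene code. The two term sets are made-up stand-ins for a stemmed and an unstemmed field; only the engineer -> engin stemming comes from this thread. The sketch assumes the wildcard prefix is compared against indexed terms as typed, without being stemmed.

```python
# Hypothetical term dictionaries standing in for two indexed fields.
stemmed_terms = {"engin", "boot"}                     # field indexed with a stemmer
unstemmed_terms = {"engineer", "engineers", "boots"}  # field indexed without stemming


def prefix_matches(terms, prefix):
    """Return the indexed terms that start with the raw, unanalyzed prefix."""
    return sorted(t for t in terms if t.startswith(prefix))


# "enginee" is longer than the stemmed term "engin", so nothing matches.
print(prefix_matches(stemmed_terms, "enginee"))
# The unstemmed field still holds the full words, so the prefix matches them.
print(prefix_matches(unstemmed_terms, "enginee"))
```

This would also be consistent with a wildcard fq returning hits while the plain term does not, if the two are hitting differently analyzed fields.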

If you want to examine the actual values in your index, consider using 
TermsComponent or Luke. Either will show you exactly what's being searched 
against.
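
As a rough sketch, a TermsComponent request could be assembled like this. The host, port, core name and the /terms handler path are assumptions about the install; the field name prodnameplurals is the one that shows up elsewhere in this thread. terms.fl, terms.prefix and terms.limit are standard TermsComponent parameters.

```python
from urllib.parse import urlencode

# Hypothetical host and core; the /terms handler must be configured in solrconfig.xml.
base = "http://localhost:8983/solr/collection1/terms"
params = {
    "terms.fl": "prodnameplurals",  # field to list indexed terms from
    "terms.prefix": "engin",        # only terms starting with this prefix
    "terms.limit": 20,              # cap the number of terms returned
}
terms_url = base + "?" + urlencode(params)
print(terms_url)
```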

I suspect that your fq entries (as typed) are going against the default field 
of "text" as defined in your schema, which doesn't stem, so that may be 
leading you astray.

Finally, you may be getting bitten by scoring, field norms and all that. If you 
have a doc ID that you _know_ contains "engineers boots", try using debug with 
explainOther (
http://wiki.apache.org/solr/CommonQueryParameters#explainOther) which might 
help you understand what's happening with the doc you care about....
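
A minimal sketch of such a debug request, assuming a local Solr with a core named collection1, a unique key field called id, and a made-up document id 12345. debugQuery and explainOther are the parameters described on the wiki page above; everything else here is a placeholder.

```python
from urllib.parse import urlencode

# Hypothetical host, core and document id; replace with your own.
base = "http://localhost:8983/solr/collection1/select"
params = {
    "q": "engineer boots",
    "defType": "edismax",
    "debugQuery": "true",
    # Query selecting the document whose score should be explained
    # against the main query, even if it is not on the first page.
    "explainOther": "id:12345",
}
debug_url = base + "?" + urlencode(params)
print(debug_url)
```

The explanation for the selected document comes back in the debug section of the response, next to the explanations for the top hits, so the two can be compared.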

Best
Erick


On Wed, Feb 20, 2013 at 7:13 AM, David Quarterman <da...@corexe.com> wrote:

> Hi Erick,
>
> Debug=all posted on http://justpaste.it/davidqhogdebug. Can't see 
> anything obvious myself....but then I'm not an expert!
>
> Regards,
>
> DQ
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: 20 February 2013 02:02
> To: solr-user@lucene.apache.org
> Subject: Re: Edismax odd results
>
> When you get back to this tomorrow, also try and paste the parsed 
> query bits you get back when you append &debug=all. Sometimes it's 
> surprising what the parsed query _really_ looks like....
>
> Best
> Erick
>
>
> On Tue, Feb 19, 2013 at 3:13 PM, David Quarterman <da...@corexe.com>
> wrote:
>
> > Hi Shawn,
> >
> > Now finished for the day but will post the schema tomorrow. Thanks 
> > for the help (and Jack too).
> >
> > Regards,
> >
> > DQ
> >
> > P.S. did reindex after changing schema and the analyzer/query stuff 
> > matches precisely!!
> >
> > Shawn Heisey <s...@elyograg.org> wrote:
> >
> > On 2/19/2013 11:16 AM, David Quarterman wrote:
> > > This is definitely driving us mad now! Changed to PorterStemming
> > > and there's very little difference.
> > >
> > > If we add fq=engineer, we get 0 results. Add fq=engineer* and we
> > > get the 90 in the system. Try with fq=ankle* and we get 2. Correct.
> > > Try with fq=harness* and we get 0!
> > >
> > > The stemming reduces 'engineer' to 'engin' so I'd have expected a
> > > lot more results.
> > >
> > > Anyone got any ideas?
> >
> > Did you completely reindex when you changed your schema?  You must
> > reindex.
> >
> > Does the index analysis match the query analysis?  Some specific 
> > differences are allowed (and sometimes encouraged), but stemming 
> > must be done to both.  Can you share your schema?  Use a paste 
> > website like pastie.org for that.
> >
> > Thanks,
> > Shawn
> >
> >
>
