It looks like the code that constructs the boost phrase for pf always
appends a trailing blank. That is harmless with a normal tokenizer, which
discards white space, but the keyword tokenizer preserves the extra space,
and that prevents an exact match.

See line 531:
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.0/solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java

I'd call it a bug, though it is really a narrow use case that wasn't
considered or tested.
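
To make the effect of the trailing blank concrete, here is a minimal,
self-contained sketch. It does not use or reproduce the Solr code: the
class TrailingBlankSketch and the methods buildPhrase, keywordTokens and
whitespaceTokens are hypothetical stand-ins, and it assumes that the
phrase really is built by appending a blank after every term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class TrailingBlankSketch {

    // Hypothetical stand-in for the phrase construction (an assumption,
    // not the Solr source): every term is followed by a blank, so the
    // result ends with one extra space.
    static String buildPhrase(List<String> terms) {
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            sb.append(term).append(' ');
        }
        return sb.toString();
    }

    // Keyword-style analysis: the whole input becomes a single
    // lowercased token, white space included.
    static List<String> keywordTokens(String input) {
        return Collections.singletonList(input.toLowerCase(Locale.ROOT));
    }

    // Whitespace-style analysis: split on runs of white space and drop
    // empty pieces, so a trailing blank disappears.
    static List<String> whitespaceTokens(String input) {
        List<String> tokens = new ArrayList<>();
        for (String piece : input.split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String indexed = "some words";
        String phrase = buildPhrase(Arrays.asList("some", "words"));

        // Keyword-style: "some words" vs "some words " do not match.
        System.out.println(keywordTokens(indexed).equals(keywordTokens(phrase)));
        // Whitespace-style: the trailing blank is dropped, tokens match.
        System.out.println(whitespaceTokens(indexed).equals(whitespaceTokens(phrase)));
        // Trimming the phrase first restores the keyword-style match.
        System.out.println(keywordTokens(indexed).equals(keywordTokens(phrase.trim())));
    }
}
```

Running main prints false, true, true. The last line suggests that
trimming the query-side text before it reaches the keyword tokenizer
could work around the problem, but that has not been tested against
Solr here.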

-- Jack Krupansky

On Tue, Apr 5, 2016 at 7:50 AM, <jimi.hulleg...@svensktnaringsliv.se> wrote:

> Hi,
>
> I'm trying to boost documents using phrase field boosting (i.e. the pf
> parameter for edismax), but I can't get it to work (i.e. boost documents
> where the pf field matches the query as a phrase).
>
> As far as I can tell, solr, or more specifically the edismax handler, does
> *something* when I add this parameter. I know this because the QTime
> increases from around 5-10ms to around 30-40 ms, and the score explain
> structure is *slightly* modified (though with the same final score for all
> documents). But nowhere in the explain structure can I see anything about
> the pf. And I can't understand that. Shouldn't it be included in the
> explain? If not, is there any way to force it to be included somehow?
>
> The query looks something like this:
>
>
> ?q=some+words&rows=10&sort=score+desc&debugQuery=true&fl=objectid,exactTitle,score%2C%5Bexplain+style%3Dtext%5D&qf=title%5E2&qf=swedishText1%5E1&defType=edismax&pf=exactTitle%5E5&wt=xml&indent=true
>
>
> I have one document that has the title "some words", and when I do a
> simple query filter with exactTitle:"some words" I get a match for that
> document. So then I would expect that the query above would boost this
> document, and include information about this in the explain. But nothing
> like this happens, and I can't understand why.
>
> The field looks like this:
>
> <field name="exactTitle" type="keywordText" indexed="true" stored="true"
> required="false" multiValued="false" />
>
> And the fieldType looks like this:
>
> <fieldType name="keywordText" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.HTMLStripCharFilterFactory" />
>     <tokenizer class="solr.KeywordTokenizerFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>   </analyzer>
> </fieldType>
>
>
> I have also tried boosting this document using a boost query, ie
> bq=exactTitle:"some words", and this works as expected. The document score
> is boosted, and the explain states this very clearly, with this segment:
>
> [...]
> 9.870669 = (MATCH) weight(exactTitle:some words^5.0 in 12)
> [DefaultSimilarity], result of:
> [...]
>
> Why is this working, but q=some+words&pf=exactTitle^5 not? Shouldn't
> edismax rewrite my "pf query" into something very similar to the "bq query"?
>
> Regards
> /Jimi
>
