Multifield query parser and phrase query behaviour from 1.3 to 1.4

Jérôme Etévé Tue, 27 Oct 2009 05:45:15 -0700

Hi All,
 I'm using a multifield query parser to generate weighted queries
across different fields.


For instance, the query "perl developer" gives me:
+(title:perl^10.0 keywords:perl company:perl^3.0)
+(title:developer^10.0 keywords:developer company:developer^3.0)
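
To make the expansion concrete, here is a minimal Python sketch of the
kind of rewriting described above. It is only an illustration, not Solr's
implementation: the names FIELD_BOOSTS, field_clause and expand are
hypothetical, and the weights are simply read off the output shown.

```python
# Hypothetical field weights, read off the example output above.
# None means "no explicit boost" (the keywords field).
FIELD_BOOSTS = [("title", 10.0), ("keywords", None), ("company", 3.0)]


def field_clause(field, text, boost):
    """Build one field:text clause, appending ^boost when a boost is set."""
    clause = f"{field}:{text}"
    return clause if boost is None else f"{clause}^{boost}"


def expand(query):
    """Return one required (+) clause per whitespace-separated word."""
    clauses = []
    for word in query.split():
        per_field = " ".join(
            field_clause(field, word, boost) for field, boost in FIELD_BOOSTS
        )
        clauses.append(f"+({per_field})")
    return clauses


if __name__ == "__main__":
    for clause in expand("perl developer"):
        print(clause)
```

Each word becomes one mandatory clause that may match in any of the
weighted fields.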

In both Solr 1.3 and Solr 1.4 (from 12 Oct 2009), a query like
"d'affaire" gives me:
title:"d affaire"^10.0 keywords:"d affaire" company:"d affaire"^3.0

NB: "d" is not a stopword.

That's the first thing I don't get: since "d'affaire" is parsed as two
separate tokens, 'd' and 'affaire', why do these phrase queries appear?

When I use the analysis interface of Solr, "d'affaire" gives (for
query or indexing, since the analyzer is the same):
term position       1       2
term text           d       affaire
term type           word    word
source start,end    0,1     2,9

You can't see it in this email, but 'd' and 'affaire' are both purple,
indicating a match with the query tokens.
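
The positions and offsets in the table above can be reproduced with a
small Python sketch. The regular expression is only a stand-in (an
assumption) for the real analyzer chain of the field, which is not shown
in this message; the function name analyze is hypothetical.

```python
import re


def analyze(text):
    """Split text on non-word characters and report, for each token,
    its 1-based position, its text, and its start/end character offsets."""
    rows = []
    for position, match in enumerate(re.finditer(r"\w+", text), start=1):
        rows.append((position, match.group(), match.start(), match.end()))
    return rows


if __name__ == "__main__":
    for position, token, start, end in analyze("d'affaire"):
        print(position, token, f"{start},{end}")
```

The apostrophe at offset 1 is dropped, which is why the second token
starts at offset 2.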

I don't really get why these two tokens are subsequently put together
in a phrase query.
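
One possible explanation, sketched in Python under an assumption: the
parser hands each whitespace-separated chunk of the query to the field's
analyzer, and emits a phrase whenever the analyzer returns more than one
token for that chunk. The analyzer here is a regex stand-in, and the
names analyze and field_query are hypothetical.

```python
import re


def analyze(text):
    """Stand-in analyzer (assumption): lowercase, split on non-word chars."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def field_query(field, chunk, boost=None):
    """Build the clause for one chunk of query text against one field.

    One token gives a plain term clause; several tokens from the same
    chunk are kept together as a quoted phrase.
    """
    tokens = analyze(chunk)
    if not tokens:
        return ""
    if len(tokens) == 1:
        body = f"{field}:{tokens[0]}"
    else:
        phrase = " ".join(tokens)
        body = f'{field}:"{phrase}"'
    return body if boost is None else f"{body}^{boost}"


if __name__ == "__main__":
    print(field_query("title", "d'affaire", 10.0))
    print(field_query("keywords", "perl"))
```

Under that assumption, "d'affaire" is a single chunk for the parser, so
its two tokens end up in one phrase even though no quotes were typed.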

In Solr 1.3, this didn't seem to be a problem, though: title:"d affaire"
matches documents whose title contains "d'affaire", and all is fine.
That's the behaviour I would expect, since the title field uses
exactly the same analyzer at index and query time.

Since I moved to Solr 1.4, title:"d affaire" no longer returns any results.

Has any behaviour changed that could be responsible for this, and
what's the correct way to fix it?

Thanks for your help.

Jerome.

-- 
Jerome Eteve.
http://www.eteve.net
jer...@eteve.net
