Actually, here is the difference between the textgen analysis pipeline and ours:

For the phrase "ingenieur d'affaire senior",
our pipeline gives, right after our tokenizer:

term position      1           2      3         4
term text          ingenieur   d      affaire   senior

'd' and 'affaire' are separated into different tokens straight away. Our
later filters have no effect on this phrase.

* The textgen pipeline uses a whitespace tokenizer, so it gives first:
term position      1           2           3
term text          ingenieur   d'affaire   senior
term type          word        word        word
source start,end   0,9         10,19       20,26

* Then a word delimiter filter splits the token "d'affaire" (and
generates the concatenation):
term position      1           2       3         4
term text          ingenieur   d       affaire   senior
term type          word        word    word      word
source start,end   0,9         10,11   12,19     20,26

plus the concatenated token: daffaire (term type word, source start,end 10,19)
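
To make the comparison easier to reproduce outside Solr, here is a rough Python sketch of the two tokenization paths described above. It is only a toy approximation, not Lucene's actual implementation: the function names (split_tokenize, whitespace_tokenize, word_delimiter) are made up for illustration, and it models only term texts and source offsets, not term positions or position increments.

```python
import re

def split_tokenize(text):
    """Toy stand-in for a tokenizer that splits on anything non-alphanumeric."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"[A-Za-z0-9]+", text)]

def whitespace_tokenize(text):
    """Toy whitespace tokenizer: one token per run of non-space characters."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\S+", text)]

def word_delimiter(tokens):
    """Toy word delimiter step: split each token on non-alphanumerics and,
    when a token was split, also emit the concatenation of its parts."""
    out = []
    for term, start, end in tokens:
        parts = [(m.group(), start + m.start(), start + m.end())
                 for m in re.finditer(r"[A-Za-z0-9]+", term)]
        out.extend(parts)
        if len(parts) > 1:
            # The concatenated token keeps the offsets of the original token.
            out.append(("".join(p[0] for p in parts), start, end))
    return out

phrase = "ingenieur d'affaire senior"
ours = split_tokenize(phrase)
textgen_like = word_delimiter(whitespace_tokenize(phrase))

for name, tokens in (("ours", ours), ("textgen-like", textgen_like)):
    print(name)
    for term, start, end in tokens:
        print(f"  {term:<10} {start},{end}")
```

In this sketch both paths produce the terms ingenieur, d, affaire and senior with the same offsets; the only extra output on the textgen-like side is the concatenated token daffaire.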


Can you see a reason why title:"d affaire" works with textgen but
not with our type?

Thanks!

Jerome.


2009/10/27 Jérôme Etévé <jerome.et...@gmail.com>:
> Hum,
>  That's probably because of our own customized types/tokenizers/filters.
>
> I tried reindexing and querying our data using the default solr type
> 'textgen' and it works fine.
>
> I need to investigate which features of the new lucene 2.9 API are not
> implemented in our own tokenizers etc...
>
> Thanks.
>
> Jerome.
>
> 2009/10/27 Yonik Seeley <yo...@lucidimagination.com>:
>> On Tue, Oct 27, 2009 at 8:44 AM, Jérôme Etévé <jerome.et...@gmail.com> wrote:
>>> I don't really get why these two tokens are subsequently put together
>>> in a phrase query.
>>
>> That's the way the Lucene query parser has always worked... phrase
>> queries are made if multiple tokens are produced from one field query.
>>
>>> In solr 1.3, it didn't seem to be a problem though. title:"d affaire"
>>> matches documents where title contains "d'affaire" and all is fine.
>>
>> This should not have changed between 1.3 and 1.4...
>> What's the fieldType and its definition for your title field?
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>
>
>
> --
> Jerome Eteve.
> http://www.eteve.net
> jer...@eteve.net
>



-- 
Jerome Eteve.
http://www.eteve.net
jer...@eteve.net
