Actually, here is the difference between the textgen analysis pipeline and ours:

For the phrase "ingenieur d'affaire senior":

* Our pipeline gives, right after our tokenizer:

  term position      1          2      3        4
  term text          ingenieur  d      affaire  senior

  'd' and 'affaire' are separated into different tokens straight away.
  Our filters have no further effect on this phrase.

* The textgen pipeline uses a whitespace tokenizer, so it first gives:

  term position      1          2          3
  term text          ingenieur  d'affaire  senior
  term type          word       word       word
  source start,end   0,9        10,19      20,26

* Then a word delimiter filter splits the token "d'affaire" (and
  generates the concatenation):

  term position      1          2      3        4
  term text          ingenieur  d      affaire  senior  daffaire
  term type          word       word   word     word    word
  source start,end   0,9        10,11  12,19    20,26   10,19

Can you see a reason why title:"d affaire" works with textgen but not
with our type?

Thanks!

Jerome.

2009/10/27 Jérôme Etévé <jerome.et...@gmail.com>:
> Hum,
> That's probably because of our own customized types/tokenizers/filters.
>
> I tried reindexing and querying our data using the default solr type
> 'textgen' and it works fine.
>
> I need to investigate which features of the new lucene 2.9 API are not
> implemented in our own tokenizers etc...
>
> Thanks.
>
> Jerome.
>
> 2009/10/27 Yonik Seeley <yo...@lucidimagination.com>:
>> On Tue, Oct 27, 2009 at 8:44 AM, Jérôme Etévé <jerome.et...@gmail.com> wrote:
>>> I don't really get why these two tokens are subsequently put together
>>> in a phrase query.
>>
>> That's the way the Lucene query parser has always worked... phrase
>> queries are made if multiple tokens are produced from one field query.
>>
>>> In solr 1.3, it didn't seem to be a problem though. title:"d affaire"
>>> matches documents where title contains "d'affaire" and all is fine.
>>
>> This should not have changed between 1.3 and 1.4...
>> What's the fieldType and its definition for your title field?
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>
>
>
> --
> Jerome Eteve.
> http://www.eteve.net
> jer...@eteve.net
>

--
Jerome Eteve.
http://www.eteve.net
jer...@eteve.net
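
For anyone following the thread, the two analysis chains described above can be imitated in a few lines of Python. This is only a rough sketch, not Solr or Lucene code: the function names (letter_tokenize, whitespace_tokenize, word_delimiter) are made up for illustration, the custom tokenizer is assumed to split on every non-alphanumeric character, and position increments are deliberately not modelled. It only reproduces the term texts and start/end offsets listed in the tables.

```python
import re

# Matches runs of letters/digits (everything that is not \W and not "_").
ALNUM = re.compile(r"[^\W_]+")


def letter_tokenize(text):
    """Hypothetical stand-in for our custom tokenizer.

    Assumption: it splits on any non-alphanumeric character, so the
    apostrophe separates 'd' and 'affaire' immediately.
    Returns (term, start, end) tuples.
    """
    return [(m.group(), m.start(), m.end()) for m in ALNUM.finditer(text)]


def whitespace_tokenize(text):
    """Split on whitespace only, as a whitespace tokenizer would."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def word_delimiter(tokens):
    """Rough imitation of a word delimiter filter with catenation on.

    Each token is split into its alphanumeric parts; when a token yields
    more than one part, the concatenation of the parts is emitted too,
    carrying the offsets of the original token.
    """
    out = []
    for term, start, end in tokens:
        parts = [
            (m.group(), start + m.start(), start + m.end())
            for m in ALNUM.finditer(term)
        ]
        out.extend(parts)
        if len(parts) > 1:
            out.append(("".join(p[0] for p in parts), start, end))
    return out


if __name__ == "__main__":
    phrase = "ingenieur d'affaire senior"
    print("ours:   ", letter_tokenize(phrase))
    print("textgen:", word_delimiter(whitespace_tokenize(phrase)))
```

Both chains end up with the terms 'd' and 'affaire' at the same offsets; the only visible difference in this sketch is the extra 'daffaire' term on the textgen side. Whatever else differs (positions, token types) is not captured here.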