Re: analyzer type="query" with NGramTokenFilterFactory forces phrase query

Robert Muir Mon, 18 Jan 2010 10:06:59 -0800

The way the query parser treats whitespace is also a problem for
languages whose words contain spaces, like Vietnamese.
I think it also causes grief for multi-word synonyms, which
don't work correctly at query time:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#SynonymFilter
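
To make the synonym problem concrete, here is a rough sketch in Python. It is a toy model, not Lucene or Solr code: the SYNONYMS table, analyze() and parse_query() are all hypothetical names, and the only behavior it assumes is that the parser splits the query on whitespace before handing each piece to the analyzer.

```python
# Toy model only: none of these names exist in Lucene or Solr.
# Hypothetical multi-word synonym rule: "new york" => "ny"
SYNONYMS = {("new", "york"): ["ny"]}


def analyze(text):
    """Lowercase, tokenize on whitespace, then apply multi-word synonym rules."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        for src, dst in SYNONYMS.items():
            if tuple(tokens[i:i + len(src)]) == src:
                out.extend(dst)      # the whole multi-word match is replaced
                i += len(src)
                break
        else:
            out.append(tokens[i])    # no rule matched at this position
            i += 1
    return out


def parse_query(query):
    """Mimic a parser that splits on whitespace BEFORE analysis (assumed)."""
    terms = []
    for chunk in query.split():
        # The analyzer only ever sees one word at a time here,
        # so a two-word synonym rule can never fire.
        terms.extend(analyze(chunk))
    return terms


index_terms = analyze("New York pizza")      # analyzer sees the whole text
query_terms = parse_query("New York pizza")  # analyzer sees single words
print(index_terms, query_terms)
```

In this model the index side ends up with the terms ny and pizza, while the query side still looks for new, york and pizza, so the synonym never lines up at query time.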


2010/1/18 Wangsheng Mei <hairr...@gmail.com>:
> I faced a similar problem when dealing with Chinese word search.
> After simply adding a PositionFilter at the end of the analyzer, the damn phrase
> query disappeared and was replaced by term queries, which is what I expected.
> That's very nice, thank you very much!
>
> Note that Chinese word segmentation is very different from English word
> segmentation in that the latter uses whitespace as the delimiter.
> So if I search for "中国汉字", solr(lucene) will treat it as a phrase search because
> it doesn't see any whitespace within the query string. But in fact, it should
> be treated as a BooleanQuery (OR) of two term queries in this case.
> Anyway, I am confused by solr(lucene)'s behavior on this. Is it a bug?
>
> 2010/1/1 AHMET ARSLAN <iori...@yahoo.com>
>
>> > "if this is the expected behaviour is
>> > there a way to override it?"[1]
>> >
>> > [1] me
>>
>>
>> Using PositionFilterFactory[1] after NGramFilterFactory can yield parsed
>> query:
>>
>> field:fa field:am field:mi field:il field:ly field:fam field:ami field:mil
>> field:ily
>>
>> [1]
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory
>>
>>
>>
>>
>
>
> --
> 梅旺生
>
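
For anyone following along, the n-gram output quoted above can be reproduced with a small Python sketch. This is a simplified model and not the Lucene implementation: ngrams(), position_filter() and build_query() are hypothetical helpers. It assumes that each gram gets its own position, that tokens spread over several positions are turned into a phrase query, and that tokens sharing one position are OR'ed as separate term queries.

```python
def ngrams(term, min_n=2, max_n=3):
    """Return (gram, position) pairs; every gram gets its own position (assumed)."""
    grams = []
    pos = 0
    for n in range(min_n, max_n + 1):
        for i in range(len(term) - n + 1):
            grams.append((term[i:i + n], pos))
            pos += 1
    return grams


def position_filter(tokens):
    """Model of a position filter: stack every token on the first token's position."""
    if not tokens:
        return []
    first_pos = tokens[0][1]
    return [(text, first_pos) for text, _ in tokens]


def build_query(field, tokens):
    """Phrase query if tokens span several positions, otherwise OR'ed term queries."""
    positions = {pos for _, pos in tokens}
    if len(positions) > 1:
        return '%s:"%s"' % (field, " ".join(text for text, _ in tokens))
    return " ".join("%s:%s" % (field, text) for text, _ in tokens)


grams = ngrams("family")
print(build_query("field", grams))                   # one phrase query
print(build_query("field", position_filter(grams)))  # separate term queries
```

With the positions flattened, the second call produces the same list of field:... term queries shown in the quoted message; without the flattening step the grams come out as a single quoted phrase.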



-- 
Robert Muir
rcm...@gmail.com
