the way that queryparser treats whitespace is also a problem for languages that have words that contain spaces, like vietnamese. i think it also causes grief for multi-word synonyms, such that they don't work correctly at querytime: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#SynonymFilter
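
for the n-gram case quoted below, here is a rough python sketch of why PositionFilter changes the parsed query. it is a toy model, not lucene code: the input word "family" is inferred from the quoted grams, the helper names (`ngrams`, `position_filter`, `build_query`) are made up for illustration, and it assumes each gram advances the position by 1 and that the parser builds a phrase query whenever one chunk of text analyzes to tokens at more than one position.

```python
def ngrams(text, min_gram=2, max_gram=3):
    """All character n-grams of text, shortest grams first."""
    return [text[i:i + n]
            for n in range(min_gram, max_gram + 1)
            for i in range(len(text) - n + 1)]


def position_filter(tokens):
    """Mimic PositionFilter: keep the first token's increment, zero the rest."""
    return [(term, inc if i == 0 else 0)
            for i, (term, inc) in enumerate(tokens)]


def build_query(field, tokens):
    """Toy stand-in for the parser's choice between phrase and term queries."""
    terms = [term for term, _ in tokens]
    # Any later token that advances the position means the tokens span
    # several positions, which this model turns into a phrase query.
    if any(inc > 0 for _, inc in tokens[1:]):
        return '{}:"{}"'.format(field, " ".join(terms))
    # All tokens share one position: emit independent (OR-able) term queries.
    return " ".join("{}:{}".format(field, t) for t in terms)


# Assumption: every gram is emitted with a position increment of 1.
tokens = [(gram, 1) for gram in ngrams("family")]

phrase = build_query("field", tokens)
flat = build_query("field", position_filter(tokens))

print(phrase)
print(flat)
```

under those assumptions the first query comes out as a single phrase and the second as separate term queries, which matches the parsed query shown in the quoted message.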
2010/1/18 Wangsheng Mei <hairr...@gmail.com>:
> I faced a similar problem when I was dealing with Chinese word search.
> By simply adding a PositionFilter at the end of the analyzer, the damn phrase
> query disappeared and was replaced by term queries, which is what I expected.
> That's very nice, thank you very much!
>
> Note that Chinese word segmentation is very different from English word
> segmentation in that the latter uses whitespace as the delimiter.
> So if I search "中国汉字", solr(lucene) will treat it as a phrase search because
> it doesn't see any whitespace within the query string. But in fact, it should
> be considered a BooleanQuery(OR) with two term queries in this case.
> Anyway, I am confused by solr(lucene)'s behavior on this. Is it a bug?
>
> 2010/1/1 AHMET ARSLAN <iori...@yahoo.com>
>
>> > "if this is the expected behaviour is
>> > there a way to override it?"[1]
>> >
>> > [1] me
>>
>>
>> Using PositionFilterFactory[1] after NGramFilterFactory can yield parsed
>> query:
>>
>> field:fa field:am field:mi field:il field:ly field:fam field:ami field:mil
>> field:ily
>>
>> [1]
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory
>>
>
>
> --
> 梅旺生
>

--
Robert Muir
rcm...@gmail.com