Re: Indexing documents in Chinese

2015-06-10 Thread Zheng Lin Edwin Yeo
I've tried using solr.HMMChineseTokenizerFactory with the following configurations: The documents can be indexed, but when I search for a word, the results match many other words as well, not just the word I searched for. Why is this so? For example, the query ht

Re: Indexing documents in Chinese

2015-06-09 Thread Alexandre Rafalovitch
You may find this series of articles on CJK analysis/search helpful: http://discovery-grindstone.blogspot.com.au/ It's a little out of date, but should be a very solid intro. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 10 J