Thanks, Otis and Luke.
Yes, it does make sense to spellcheck phrases in Chinese. It looks like
the default Solr spellcheck component is already doing some kind of
n-gramming. When I examined the spellcheck index, I saw gram1, gram2,
gram3, gram4...
The problem is no Chinese terms were indexed into th
It doesn't make sense to spellcheck individual single-character words,
but it makes a lot of sense for phrases. Because pinyin input methods
are so widely used, it's very easy to write phrases that are
semantically wrong but "sound" correct. N-grams should work as long
as they don't mangle the characters.
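
To illustrate what n-gramming without mangling the characters could look
like, here is a minimal Python sketch. It is not what the Solr spellcheck
component does internally; the function name char_ngrams and the sample
phrase are made up for illustration. The only assumption is that the
text is split on Unicode code points rather than on raw bytes.

```python
def char_ngrams(text, n):
    """Return the overlapping n-grams of text, counted in code points.

    Python 3 strings are indexed by Unicode code point, so a Chinese
    character is never cut in half the way it could be if the text
    were sliced as UTF-8 bytes.
    """
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Hypothetical sample phrase ("pinyin input method"), for illustration only.
phrase = "拼音输入法"
bigrams = char_ngrams(phrase, 2)
trigrams = char_ngrams(phrase, 3)
```

Each gram is itself a valid string of whole characters, so a phrase that
was typed with the wrong homophone differs from the intended phrase in
only a few grams.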
On T
Hi,
Does spellchecking in Chinese actually make sense? I once asked a native
Chinese speaker about that and the person told me it didn't really make sense.
Anyhow, with n-grams, I don't think this could technically work even if it made
sense for Chinese, could it?
Otis
Sematext :: http://