It doesn't make sense to spell-check individual single-character words,
but it makes a lot of sense for phrases. Because pinyin input methods
are so pervasive, it's very easy to write phrases that are semantically
wrong but "sound" correct. N-grams should work as long as the analysis
doesn't mangle the characters.
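
To illustrate what "not mangling the characters" means, here is a minimal
sketch in Python (not Solr code) that builds overlapping character bigrams
by code point. The function name char_ngrams and the sample phrase are my
own, chosen only for illustration; the actual spellchecker analysis chain
may tokenize differently.

```python
def char_ngrams(text, n=2):
    """Return overlapping n-grams of Unicode code points from text."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Python str is indexed by code point, so slicing never splits a
    # multi-byte UTF-8 character the way byte-level slicing would.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


phrase = "拼音输入法"  # "pinyin input method", an illustrative phrase
bigrams = char_ngrams(phrase)
print(bigrams)
```

Each bigram stays a valid two-character string, so a dictionary built from
such grams could in principle be matched against a mistyped phrase.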

On Tue, Apr 12, 2011 at 12:47 PM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> Does spellchecking in Chinese actually make sense?  I once asked a native
> Chinese speaker about that and the person told me it didn't really make sense.
> Anyhow, with n-grams, I don't think this could technically work even if it
> made sense for Chinese, could it?
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> ----- Original Message ----
>> From: alexw <aw...@crossview.com>
>> To: solr-user@lucene.apache.org
>> Sent: Tue, April 12, 2011 3:07:48 PM
>> Subject: Spellchecking in the Chinese Language
>>
>> Hi,
>>
>> I have been trying to get spellcheck to work in the Chinese language. So far
>> I have not had any luck. Can someone shed some light here, as a general
>> guideline, on what needs to happen?
>>
>> I am using the CJKAnalyzer in the text field type and searching works fine,
>> but spelling does not work. Here are the things I have tried:
>>
>> 1. Put CJKAnalyzer in the "textSpell" field type.
>> 2. Set the characterEncoding param to "utf-8" in the spellcheck search
>> component.
>> 3. Using Luke, I can see the Chinese characters in the "spell" field in the
>> main index.
>> 4. After building the spelling index, I don't see Chinese characters in the
>> "spellchecker" index, only terms in English.
>> 5. Tried adding the NGramFilterFactory to the CJKAnalyzer with no luck
>> either.
>>
>> Thanks!
>>
>>
>> --
>> View this message in context:
>>http://lucene.472066.n3.nabble.com/Spellchecking-in-the-Chinese-Lanugage-tp2812726p2812726.html
>>
>> Sent  from the Solr - User mailing list archive at Nabble.com.
>>
>
