Hi Ken,

Indeed, what I want to support is phonetic (pinyin or zhuyin) search, not soundex (sorry, and thanks for correcting me).
Any further ideas?

Floyd

2011/10/20 Ken Krugler <kkrugler_li...@transpac.com>:
>> Wow, interesting question. Can soundex even be applied to a language like
>> Chinese, which is tonal and doesn't have individual letters, but whole
>> characters? I'm no expert, but intuitively speaking it sounds hard or maybe
>> even impossible...
>
> The only two cases I can think of are:
>
> - Cases where you have two (or more) characters that are variant forms.
> Unicode tried to unify all of these, but some still exist. And in GB 18030
> there are tons.
>
> - If you wanted to support phonetic (pinyin or zhuyin) search, then you
> might want to collapse syllables that are commonly confused. But then of
> course you'd have to be storing the phonetic forms for all of the words.
>
> -- Ken
>
>
>>> From: Floyd Wu <floyd...@gmail.com>
>>> To: solr-user@lucene.apache.org
>>> Sent: Thursday, October 20, 2011 5:43 AM
>>> Subject: Does anybody has experience in Chinese soundex(sounds like) of
>>> SOLR?
>>>
>>> Hi there,
>>>
>>> There are many English soundex implementation can be referenced, but I
>>> wonder how to do Chinese soundex(sounds like) filter (maybe).
>>>
>>> any idea?
>>>
>>> Floyd
>>>
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr
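
For reference, the second suggestion in the quoted reply (store a phonetic form for each word and collapse commonly confused syllables) might be sketched roughly as below. This is only an illustration, not anything Solr ships: the `PINYIN` table, the `collapse` and `phonetic_key` helpers, and the particular confusion pairs (zh/z, ch/c, sh/s, n/l, -ng/-n) are all assumptions made for the example, and tones are ignored entirely.

```python
# Hypothetical, tiny character-to-pinyin table (toneless). A real index would
# need a full dictionary and a way to handle characters with several readings.
PINYIN = {
    "张": "zhang",
    "章": "zhang",
    "臧": "zang",
    "山": "shan",
    "三": "san",
    "牛": "niu",
    "刘": "liu",
}

# Assumed confusion rules: retroflex vs. plain initials, n vs. l,
# and back-nasal vs. front-nasal finals.
INITIALS = (("zh", "z"), ("ch", "c"), ("sh", "s"), ("n", "l"))
FINALS = (("ang", "an"), ("eng", "en"), ("ing", "in"))


def collapse(syllable):
    """Map one toneless pinyin syllable to a coarser 'sounds like' key."""
    s = syllable.lower()
    for src, dst in INITIALS:
        if s.startswith(src):
            s = dst + s[len(src):]
            break
    for src, dst in FINALS:
        if s.endswith(src):
            s = s[:-len(src)] + dst
            break
    return s


def phonetic_key(text):
    """Build a space-separated phonetic key, one token per character.

    Characters missing from the table are passed through unchanged.
    """
    parts = []
    for ch in text:
        pinyin = PINYIN.get(ch)
        parts.append(collapse(pinyin) if pinyin else ch)
    return " ".join(parts)


if __name__ == "__main__":
    print(phonetic_key("张三"))
    print(phonetic_key("臧山"))
```

The idea would be to store this key in a separate field at index time and run the query text through the same function, so that names differing only in the collapsed sounds end up with the same key.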