RE: Tokenizer and Filter Factory to index Chinese characters

2015-07-07 Thread Markus Jelsma
Yes, but it is a small change :) M. -Original message- > From:Zheng Lin Edwin Yeo > Sent: Tuesday 7th July 2015 4:50 > To: solr-user@lucene.apache.org > Subject: Re: Tokenizer and Filter Factory to index Chinese characters > > So we have to recompile the analyser

Re: Tokenizer and Filter Factory to index Chinese characters

2015-07-06 Thread Zheng Lin Edwin Yeo
heng Lin Edwin Yeo > > Sent: Monday 6th July 2015 12:31 > > To: solr-user@lucene.apache.org > > Subject: Re: Tokenizer and Filter Factory to index Chinese characters > > > > Yes, I tried that also, but I faced some compatibility issues with Solr > > 5.2.1, as the

RE: Tokenizer and Filter Factory to index Chinese characters

2015-07-06 Thread Markus Jelsma
Yes, the analyzers changed slightly in 5.x. https://issues.apache.org/jira/browse/LUCENE-5388 -Original message- > From:Zheng Lin Edwin Yeo > Sent: Monday 6th July 2015 12:31 > To: solr-user@lucene.apache.org > Subject: Re: Tokenizer and Filter Factory to index Chines

Re: Tokenizer and Filter Factory to index Chinese characters

2015-07-06 Thread Zheng Lin Edwin Yeo
> > > > "chinese4":{ > > > > "text":["户只要订购《联合晚报》任一种配套,就可选择下列其中一项赠品带回家。 \n 签订两年配套的读者可获得一台价值 > > 199元的Lenovo TAB 2 > A7-10七寸平板电脑,或者一架价值249元的Philips > > Viva"]}, > > > > "chinese5":{ > > > >

Re: Tokenizer and Filter Factory to index Chinese characters

2015-07-06 Thread davidphilip cherian
> > "content":["结束连续两个月的萎缩,但比经济师普遍预估的增长3.3%疲软得多。这也意味着,我国今年第一季度的经济很可能让人失望 > > > > > \n "], > > > > > "author":["Edwin"]}, > > > > > "chinese2":{ > > > > > "id":["chinese2"

Re: Tokenizer and Filter Factory to index Chinese characters

2015-07-05 Thread Zheng Lin Edwin Yeo
"chinese5":{ "text":["Zheng Lin Yeo"]}}} Why is this so? Regards, Edwin 2015-06-25 18:54 GMT+08:00 Markus Jelsma : > You may also want to try Paoding if you have enough time to spend: > https://github.com/cslinmiso/paoding-analysis > > -Origi

RE: Tokenizer and Filter Factory to index Chinese characters

2015-06-25 Thread Markus Jelsma
”幸运抽奖"], > "author":["Edwin"]}}} > > > Regards, > Edwin > > > 2015-06-25 17:28 GMT+08:00 Markus Jelsma : > > > Hi - we are actually using some other filters for Chinese, although they > > are not specialized for Chinese:

Re: Tokenizer and Filter Factory to index Chinese characters

2015-06-25 Thread Zheng Lin Edwin Yeo
此外,一年一度的晚报保健美容展,将在本月23日和24日,在新达新加坡会展中心401、402展厅举行。 \n 现场将开设《联合晚报》订阅展摊,读者当场订阅晚报,除了可获得丰厚的赠品,还有机会参与“必胜”幸运抽奖"], "author":["Edwin"]}}} Regards, Edwin 2015-06-25 17:28 GMT+08:00 Markus Jelsma : > Hi - we are actually using some other filters for Chinese, although they >

RE: Tokenizer and Filter Factory to index Chinese characters

2015-06-25 Thread Markus Jelsma
Subject: Re: Tokenizer and Filter Factory to index Chinese characters > > Thank you. > > I've tried that, but when I do a search, it's returning many more > highlighted results than it is supposed to. > > For example, if I enter the following query: > http://localhos

Re: Tokenizer and Filter Factory to index Chinese characters

2015-06-25 Thread Zheng Lin Edwin Yeo
tory, but there's no improvement in the search results. Regards, Edwin On 25 June 2015 at 17:17, Markus Jelsma wrote: > Hello - you can use HMMChineseTokenizerFactory instead. > > http://lucene.apache.org/core/5_2_0/analyzers-smartcn/org/apache/lucene/analysis/cn/smart/HMMChineseTokenizerFactory.ht

RE: Tokenizer and Filter Factory to index Chinese characters

2015-06-25 Thread Markus Jelsma
solr-user@lucene.apache.org > Subject: Tokenizer and Filter Factory to index Chinese characters > > Hi, > > Does anyone know the correct replacement for these two tokenizer and > filter factories for indexing Chinese in Solr? > - SmartChineseSentenceTokenizerFactory > - SmartChi

Tokenizer and Filter Factory to index Chinese characters

2015-06-25 Thread Zheng Lin Edwin Yeo
Hi, Does anyone know the correct replacement for these two tokenizer and filter factories for indexing Chinese in Solr? - SmartChineseSentenceTokenizerFactory - SmartChineseWordTokenFilterFactory I understand that both factories are already deprecated in Solr 5.1, but I c