Re: A bug of Chinese synonym

jnduan Tue, 26 Feb 2013 07:19:12 -0800

Hi liwei,
I think you'd better ask questions in English; otherwise most people here may 
not understand what you are asking.
I'm confused by the class 
cn.antvision.eagleattack.nest.analyzer.CIKTokenizerFactory. What exactly does 
it do? What extra functionality did you add in this class?
If you are using IKAnalyzer's default implementation, it should split 
'北京市动物园' into '北京市' and '动物园'.
And how did you configure the synonyms.txt file? Please paste its content so 
we can find out what is going wrong.
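
To make the expected behaviour concrete, here is a rough Python sketch of how 
the analyzed tokens could be grouped into the query you want. It is only an 
illustration, not how Solr's QueryParser is implemented. The helper names 
(group_by_position, to_query) are made up, and the token list assumes that the 
synonym filter emits '北京' at the same position as '北京市' (position 
increment 0).

```python
def group_by_position(tokens):
    """Group (term, position_increment) pairs by token position.

    A position increment of 0 means the term sits at the same position as
    the previous one (assumed here to be how an expanded synonym appears).
    """
    groups = []
    for term, pos_inc in tokens:
        if pos_inc > 0 or not groups:
            groups.append([term])      # start a new position
        else:
            groups[-1].append(term)    # stacked on the previous position
    return groups


def to_query(field, groups):
    """Build a query string: each position is required, alternatives within it are optional."""
    clauses = []
    for terms in groups:
        if len(terms) == 1:
            clauses.append(f"+{field}:{terms[0]}")
        else:
            inner = " ".join(f"{field}:{t}" for t in terms)
            clauses.append(f"+({inner})")
    return " ".join(clauses)


# Hypothetical analysis output for the input '北京市动物园'
tokens = [("北京市", 1), ("北京", 0), ("动物园", 1)]
print(to_query("content", group_by_position(tokens)))
# prints: +(content:北京市 content:北京) +content:动物园
```

This matches the parsed query you get when you search with a space, which is 
the grouping you are after for the unspaced input as well.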


Best regards.

Jienan Duan

On 2013-2-26, at 4:57 PM, liwei  <chidaweili...@163.com> wrote:

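> Hi everyone,
> I ran into a problem with Chinese synonyms while using Solr. For example, 
> "北京" and "北京市" are synonyms. If I search for "北京市动物园", I expect to 
> find documents containing "+北京市 +动物园" or "+北京 +动物园", but the 
> QueryParser parses it as "+北京市 +北京 +动物园".
> If I add a space and search for "北京市  动物园", I do get the documents I 
> expect: those containing "动物园" and either "北京市" or "北京".
> 
> I have pasted the debug info from the searches below. Has anyone run into the 
> same problem? How can I solve it?
> 
> // debug info for the search "北京市动物园"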
> <str name="rawquerystring">北京市动物园</str>
> <str name="querystring">北京市动物园</str>
> <str name="parsedquery">+content:北京市 +content:北京 +content:动物园</str>
> <str name="parsedquery_toString">+content:北京市 +content:北京 +content:动物园</str>
> <lst name="explain"/>
> <str name="QParser">LuceneQParser</str>
> 
> 
> // debug info for the search "北京市 动物园"
> <str name="rawquerystring">北京市 动物园</str>
> <str name="querystring">北京市 动物园</str>
> <str name="parsedquery">+((content:北京市 content:北京)/no_coord) +content:动物园</str>
> <str name="parsedquery_toString">+(content:北京市 content:北京) +content:动物园</str>
> <lst name="explain"/>
> <str name="QParser">LuceneQParser</str>
> 
> Below is the config of the Chinese analyzer, which uses a subclass of the IK tokenizer.
> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="cn.antvision.eagleattack.nest.analyzer.CIKTokenizerFactory" isMaxWordLength="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="Stop_Words.txt" enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="cn.antvision.eagleattack.nest.analyzer.CIKTokenizerFactory" isMaxWordLength="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="Stop_Words.txt" enablePositionIncrements="true"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> 
>
