(11/12/02 4:20), Aaron Wong wrote:
Hi all,
We're encountering a problem when querying terms that contain dashes (and
other non-alphanumeric characters). We use PatternReplaceCharFilterFactory
to replace dashes with blank characters at both index and query time;
however, terms with dashes in them do not return any results.
For example:
searching for 'cdka' won't return any results, even though 'cdka-1' should
be indexed.
This is similar to the problem posted here (
http://stackoverflow.com/questions/6459695/solr-ngramtokenizerfactory-and-patternreplacecharfilterfactory-analyzer-result),
which has not received a response.
The following is the relevant part of the schema:
---------------------------------------------------------------------------------------------------------
<fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <charfilter class="solr.PatternReplaceCharFilterFactory"
                pattern="-" replacement=""/>
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" splitOnNumerics="0" generateNumberParts="0"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="0"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
            maxGramSize="15" side="front" />
  </analyzer>
  <analyzer type="query">
    <charfilter class="solr.PatternReplaceCharFilterFactory"
                pattern="-" replacement=""/>
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" splitOnNumerics="0" generateNumberParts="0"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="0"/>
  </analyzer>
</fieldType>
<fields>
  <field name="names_auto" type="edge_ngram" indexed="true" stored="true"
         multiValued="false" />
  ......
</fields>
I'm not sure I understand you correctly, but if you would like
PatternReplaceCharFilter to change "cdka-1" to "cdka 1", I think you need
to set replacement=" " rather than "".
koji
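
Koji's suggestion could be sketched roughly as follows, applied to both analyzers of the edge_ngram field type quoted above. This is only a sketch and assumes the goal is to split on the dash rather than delete it. The element is spelled charFilter here, as in the Solr documentation; XML element names are case-sensitive, so it may be worth checking whether the lowercase charfilter in the original schema is being picked up at all. The remaining filters are meant to stay as in the original and are only indicated by comments.

```xml
<fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <!-- Replace each dash with a single space, so "cdka-1" becomes "cdka 1" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="-" replacement=" "/>
    <!-- The whitespace tokenizer then splits "cdka 1" into "cdka" and "1" -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- remaining index-time filters as in the original schema -->
  </analyzer>
  <analyzer type="query">
    <!-- Same replacement at query time, so both sides are analyzed alike -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="-" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- remaining query-time filters as in the original schema -->
  </analyzer>
</fieldType>
```

Because the index-time analyzer changes, existing documents would need to be reindexed before the new tokens become searchable. The analysis page in the Solr admin interface can be used to check what each stage emits for an input such as "cdka-1".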