I am seeing this problem in several of my fields. The values look like
"Samsung X150" or "Nokia BH-212", and my queries do not match on X150 or
BH-212.
For example, my query is something like +model:(Samsung X150). Through
debugQuery, I see that this gets converted to
+(model:samsung model:"x 150"). It matches on Samsung, but not on X150.
A simple query like model:BH-212 simply fails, and model:BH212 fails as
well. The only query that seems to work is model:(BH 212).
Here is the schema for that field:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="query_synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<field name="model" type="text" indexed="true" stored="true" omitNorms="true"
       omitTermFreqAndPositions="true"/>
Any ideas? According to the analyzer, I would expect "BH-212" to match on
"bh" and "212". Or am I missing something?
Also, is there any way to tell the parser not to convert "X150" into a phrase
query? In some cases it would be more useful to turn it into +(X 150).
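To make my expectation concrete, here is a rough Python sketch of how I understand the query-side splitting. It is only an approximation under my own assumptions, not Solr's actual code: the helper names split_word_parts and describe_query are made up for illustration, the regex is my guess at what splitOnCaseChange, generateWordParts and generateNumberParts do together, and synonyms, stop words and stemming are ignored entirely.

```python
import re


def split_word_parts(token):
    """Approximate a word-delimiter filter followed by lowercasing.

    Assumption: parts are runs of letters or digits, broken at
    non-alphanumeric characters, letter/digit boundaries, and
    lower-to-upper case changes. This is a simplification.
    """
    pattern = r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+"
    return [part.lower() for part in re.findall(pattern, token)]


def describe_query(field, text):
    """Show how each whitespace-separated chunk would be searched.

    A chunk that splits into several parts is written as a phrase,
    which mirrors the debugQuery output reported above.
    """
    clauses = []
    for chunk in text.split():
        parts = split_word_parts(chunk)
        if len(parts) > 1:
            phrase = " ".join(parts)
            clauses.append(field + ':"' + phrase + '"')
        elif parts:
            clauses.append(field + ":" + parts[0])
    return " ".join(clauses)


if __name__ == "__main__":
    print(split_word_parts("BH-212"))
    print(describe_query("model", "Samsung X150"))
```

In this sketch "BH-212" splits into "bh" and "212", which is what I expected to match, and "Samsung X150" comes out as model:samsung model:"x 150", the same shape debugQuery reports.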