The pattern you are using in the PatternTokenizerFactory does not
contain the double-quote character, so the quotes stay attached to the
neighbouring words. Indexing the text "The Promulgation of Universal
Peace" therefore results in the following tokens:

"The / Promulgation / of / Universal / Peace"

The last token is Peace" (with the trailing quote), which is why a
search for Peace does not match it.
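
One possible fix, offered as a sketch rather than a tested
configuration: add the double quote to the character class so that it
is treated as a delimiter as well. Inside a double-quoted XML attribute
the quote has to be written as &quot;. The same change would be needed
in both the index and the query analyzer, and existing documents would
have to be reindexed before the new tokens take effect.

```xml
<!-- Sketch only: same tokenizer as in the schema quoted below, with the
     double quote (&quot;) added to the set of delimiter characters. -->
<tokenizer class="solr.PatternTokenizerFactory"
           pattern="[\s\.\?\!,:;&quot;]"/>
```

With this pattern the quoted title should split into The / Promulgation
/ of / Universal / Peace. The Analysis page in the Solr admin UI can be
used to check which tokens are actually produced for the field.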
On 02/26/2013 08:08 AM, Alex Cougarman wrote:
Hi. We have run into an interesting situation when searching for words that are
within double-quotes in our documents. For example, when we enter the following
search: promulgation AND peace
The document in question has this text exactly (with the double quotes): "The
Promulgation of Universal Peace"
However, it finds and highlights the word Promulgation but not the word Peace.
Here's the field's definition in our schema.xml:
<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.PatternTokenizerFactory"
               pattern="[\s\.\?\!,:;]"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory"
               pattern="[\s\.\?\!,:;]"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
Warm regards,
Alex Cougarman
Bahá'í World Centre
Haifa, Israel
Office: +972-4-835-8683
Cell: +972-54-241-4742
acoug...@bwc.org
--
Oussama Jilal