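The pattern you are using in the PatternTokenizerFactory does not include the double quote as a split character, so the quotes stay attached to the neighbouring words. Indexing the text "The Promulgation of Universal Peace" therefore results in the following tokens: "The / Promulgation / of / Universal / Peace". The last token is Peace" (with the trailing quote), which is why a search for Peace does not match it.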
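
One possible fix, given as a sketch rather than a tested configuration, is to add the double quote to the character class so that it is treated as a separator like the other punctuation. Because the attribute value is itself delimited by double quotes, the quote has to be written as the XML entity &quot;. This assumes you never need to keep a literal double quote inside a token:

```xml
<!-- Sketch only: the same tokenizer with the double quote added to the split characters. -->
<!-- &quot; is the XML escape for " inside a double-quoted attribute value. -->
<tokenizer class="solr.PatternTokenizerFactory"
           pattern="[\s\.\?\!,:;&quot;]"/>
```

The same pattern would go in both the index and the query analyzer so the two sides tokenize identically, and documents that are already indexed would need to be reindexed before the change shows up in searches.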

On 02/26/2013 08:08 AM, Alex Cougarman wrote:
Hi. We have run into an interesting situation when searching for words that are 
within double-quotes in our documents. For example, when we enter the following 
search: promulgation AND peace

The document in question has this text exactly (with the double quotes): "The 
Promulgation of Universal Peace"
However, it finds and highlights the word Promulgation but not the word Peace.
Here's the field's definition in our schema.xml:

     <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
       <analyzer type="index">
         <tokenizer class="solr.PatternTokenizerFactory" pattern="[\s\.\?\!,:;]"/>
         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
         <filter class="solr.LowerCaseFilterFactory"/>
         <filter class="solr.PorterStemFilterFactory"/>
       </analyzer>
       <analyzer type="query">
         <tokenizer class="solr.PatternTokenizerFactory" pattern="[\s\.\?\!,:;]"/>
         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <filter class="solr.PorterStemFilterFactory"/>
       </analyzer>
     </fieldType>

Warm regards,
Alex Cougarman

Bahá'í World Centre
Haifa, Israel
Office: +972-4-835-8683
Cell: +972-54-241-4742
acoug...@bwc.org



--
Oussama Jilal
