Query with plus sign failing

Shawn Heisey Thu, 29 Sep 2011 09:32:29 -0700

The following query is failing:

((Google +))

This is ultimately reduced to 'google' by my analysis chain, but the following is in my log (3.2.0, but 3.4.0 also fails):

SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse '( (Google+))': Encountered " ")" ") "" at line 1, column 12.


If I change it to 'Google+' or 'Goo+gle' it works.

Below is the fieldType definition. The pattern filter is designed to strip leading/trailing punctuation characters, but leave any punctuation in the middle of a term alone. It does affect the plus sign, by reducing it to a term of length zero. The length filter then removes it at the end. In the 'Google+' variant, the pattern filter simply strips that character off and the query does not fail. Am I seeing a bug here, or problems with my fieldType?

<fieldType name="genText" class="solr.TextField" sortMissingLast="true" positionIncrementGap="100">

<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.PatternReplaceFilterFactory"
          pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
          replacement="$2"
          allowempty="false"
        />
<filter class="solr.WordDelimiterFilterFactory"
          splitOnCaseChange="1"
          splitOnNumerics="1"
          stemEnglishPossessive="1"
          generateWordParts="1"
          generateNumberParts="1"
          catenateWords="1"
          catenateNumbers="1"
          catenateAll="0"
          preserveOriginal="1"
        />
<filter class="solr.ICUFoldingFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="1" max="512"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.PatternReplaceFilterFactory"
          pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
          replacement="$2"
          allowempty="false"
        />
<filter class="solr.WordDelimiterFilterFactory"
          splitOnCaseChange="1"
          splitOnNumerics="1"
          stemEnglishPossessive="1"
          generateWordParts="1"
          generateNumberParts="1"
          catenateWords="0"
          catenateNumbers="0"
          catenateAll="0"
          preserveOriginal="1"
        />
<filter class="solr.ICUFoldingFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="1" max="512"/>
</analyzer>
</fieldType>
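
To illustrate what the pattern filter described above does to individual tokens, here is a minimal standalone sketch. It applies the same regular expression and the same "$2" replacement using plain java.util.regex. It is not Solr's PatternReplaceFilter itself, so it only approximates the filter's behaviour, and the class name PunctStripDemo and the strip helper are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Solr or Lucene.
class PunctStripDemo {
    // Same pattern as in the fieldType: leading punctuation, lazy middle, trailing punctuation.
    private static final Pattern EDGE_PUNCT =
        Pattern.compile("^(\\p{Punct}*)(.*?)(\\p{Punct}*)$");

    // Keeps only the middle group, mirroring replacement="$2" on a single token.
    public static String strip(String token) {
        return EDGE_PUNCT.matcher(token).replaceAll("$2");
    }

    public static void main(String[] args) {
        // Tokens as the whitespace tokenizer might hand them to the filter.
        for (String token : new String[] {"+", "Google+", "Goo+gle", "(Google"}) {
            System.out.println("'" + token + "' -> '" + strip(token) + "'");
        }
    }
}
```

In this sketch a lone "+" becomes an empty string (which the length filter with min="1" would then drop), "Google+" becomes "Google", and "Goo+gle" is left unchanged because the plus sign sits in the middle of the term.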
