The following query is failing:
((Google +))
My analysis chain ultimately reduces this to 'google', but the
following appears in my log (Solr 3.2.0; 3.4.0 also fails):
SEVERE: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse '( (Google
+))': Encountered " ")" ") "" at line 1, column 12.
If I change it to 'Google+' or 'Goo+gle' it works.
Below is the fieldType definition. The pattern filter is designed to
strip leading and trailing punctuation from each token while leaving
punctuation in the middle of a term alone. It does affect a lone plus
sign, reducing it to a zero-length term, which the length filter then
removes at the end of the chain. In the 'Google+' variant, the pattern
filter simply strips the trailing plus and the query does not fail. Am
I seeing a bug here, or is there a problem with my fieldType?
<fieldType name="genText" class="solr.TextField" sortMissingLast="true"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
            replacement="$2"
            allowempty="false"
    />
    <filter class="solr.WordDelimiterFilterFactory"
            splitOnCaseChange="1"
            splitOnNumerics="1"
            stemEnglishPossessive="1"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="0"
            preserveOriginal="1"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="512"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
            replacement="$2"
            allowempty="false"
    />
    <filter class="solr.WordDelimiterFilterFactory"
            splitOnCaseChange="1"
            splitOnNumerics="1"
            stemEnglishPossessive="1"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            preserveOriginal="1"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="512"/>
  </analyzer>
</fieldType>
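
To check my description of the pattern filter in isolation, here is a
minimal sketch that applies the same regular expression with plain
java.util.regex. The class name PunctStripDemo and the strip() helper
are made up for illustration; this is not the Solr filter itself, and
it assumes the filter replaces every match (as String-style replaceAll
does). It also says nothing about what the query parser does with the
parentheses and the plus sign before the analysis chain ever runs.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone demo; only mimics the regex step of the chain.
class PunctStripDemo {
    // Same pattern as in the fieldType: leading punctuation,
    // a lazy middle group, then trailing punctuation.
    static final Pattern STRIP =
        Pattern.compile("^(\\p{Punct}*)(.*?)(\\p{Punct}*)$");

    // Keep only the middle group, as replacement="$2" does.
    static String strip(String token) {
        return STRIP.matcher(token).replaceAll("$2");
    }

    public static void main(String[] args) {
        String[] tokens = {"Google", "+", "Google+", "Goo+gle"};
        for (String token : tokens) {
            String out = strip(token);
            // A zero-length result is what LengthFilterFactory (min="1")
            // would be expected to drop at the end of the chain.
            System.out.println("[" + token + "] -> [" + out + "]"
                + (out.isEmpty() ? " (empty)" : ""));
        }
    }
}
```

On Java 11 or later this can be run directly with
"java PunctStripDemo.java". A lone "+" comes out empty, "Google+" loses
its trailing plus, and "Goo+gle" is left untouched, which matches what
I described above.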