Hi,
I have a question about phrase search in combination with the
WordDelimiterGraphFilter (Solr 8.4.1).
Whenever I search for a phrase that mixes delimited and non-delimited
tokens, I don't get any matches.
This is the configuration:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.WordDelimiterGraphFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="0"
            catenateAll="0"
            splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="text" type="text" indexed="true" stored="true"
       omitTermFreqAndPositions="false"/>
Example document:
{
  id: '1',
  text: 'mr. i.n.i.t. firstsirname secondsirname'
}
Queries and results:
Query                                    Outcome
---------------------------------------  ---------
"mr. i.n.i.t. firstsirname"              No result
"mr. i.n.i.t."                           Result
"mr. i n i t"                            Result
"mr. init"                               Result
"mr init"                                Result
"i.n.i.t. firstsirname"                  No result
"init firstsirname"                      No result
"i.n.i.t. firstsirname secondsirname"    No result
"init firstsirname secondsirname"        No result
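For anyone who wants to rerun the phrases listed above, here is a minimal Python sketch that builds the corresponding /select URLs. The host, port and core name ("mycore") are assumptions for illustration and not part of my actual setup; only the field name "text" comes from the configuration above.

```python
from urllib.parse import urlencode

# Assumption: Solr runs locally and the core is named "mycore" (hypothetical).
SOLR_BASE = "http://localhost:8983/solr/mycore"

# The phrases from the list above, without the surrounding quotes.
PHRASES = [
    "mr. i.n.i.t. firstsirname",
    "mr. i.n.i.t.",
    "mr. i n i t",
    "mr. init",
    "mr init",
    "i.n.i.t. firstsirname",
    "init firstsirname",
    "i.n.i.t. firstsirname secondsirname",
    "init firstsirname secondsirname",
]


def select_url(phrase, field="text", base=SOLR_BASE):
    """Build a /select URL that runs `phrase` as a quoted phrase query on `field`."""
    params = {"q": f'{field}:"{phrase}"', "fl": "id", "wt": "json"}
    return f"{base}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Print one URL per phrase; fetch them with curl or a browser.
    for phrase in PHRASES:
        print(select_url(phrase))
```

The sketch only prints URLs and does not contact Solr itself, so it can be adapted to whatever client is at hand.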
I don't quite understand this. Judging from the analyzer output, I can't
see why a phrase made up of only delimited tokens, or only non-delimited
tokens, matches, while a phrase that mixes the two does not. In the
examples above, every failing phrase has "firstsirname" directly after
"i.n.i.t." (or "init").
Could someone explain? And is there a solution to make it work?
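To compare the index-time and query-time token positions side by side, the field analysis handler can be queried directly. This is a rough sketch that assumes the handler is reachable at /analysis/field and, again, a hypothetical local core named "mycore"; the function name is made up for illustration.

```python
from urllib.parse import urlencode


def analysis_url(index_text, query_text, field_type="text",
                 base="http://localhost:8983/solr/mycore"):
    """Build a field-analysis URL for one index-time value and one query string.

    `base` is an assumed local core URL; adjust it to the real deployment.
    """
    params = {
        "analysis.fieldtype": field_type,   # fieldType name from the schema
        "analysis.fieldvalue": index_text,  # text run through the index analyzer
        "analysis.query": query_text,       # text run through the query analyzer
        "analysis.showmatch": "true",       # flag index tokens that match query tokens
        "wt": "json",
    }
    return f"{base}/analysis/field?{urlencode(params)}"


if __name__ == "__main__":
    print(analysis_url("mr. i.n.i.t. firstsirname secondsirname",
                       "i.n.i.t. firstsirname"))
```

The response should list the tokens produced after each filter, which makes it possible to check where "firstsirname" ends up relative to the parts of "i.n.i.t." in the index.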
Best regards,
Jeroen