So if I don't put preserveOriginal=1 in my WordDelimiterFilterFactory settings,
I cannot get a match between AppleTV on the indexing side and appletv on the
search side. Without that setting, the all-lowercase version of AppleTV ends up
only in term position 2, because of the catenateWords=1 or catenateAll=1
settings. I am surprised. How does term position affect searching? Here is my
analysis with preserveOriginal=1, which makes the lowercase appletv occur in
both term positions 1 and 2:
Index Analyzer
org.apache.solr.analysis.WhitespaceTokenizerFactory {}
term position 1
term text AppleTV
term type word
source start,end 0,7
payload
org.apache.solr.analysis.SynonymFilterFactory {synonyms=index_synonyms.txt,
expand=true, ignoreCase=true}
term position 1
term text AppleTV
term type word
source start,end 0,7
payload
org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true}
term position 1
term text AppleTV
term type word
source start,end 0,7
payload
org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=1,
generateNumberParts=1, catenateWords=1, generateWordParts=1, catenateAll=1,
catenateNumbers=1}
term position      1         2
term text          AppleTV   TV
                   Apple     AppleTV
term type          word      word
                   word      word
source start,end   0,7       5,7
                   0,5       0,7
payload
org.apache.solr.analysis.LowerCaseFilterFactory {}
term position      1         2
term text          appletv   tv
                   apple     appletv
term type          word      word
                   word      word
source start,end   0,7       5,7
                   0,5       0,7
payload
com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory
{protected=protwords.txt}
term position      1         2
term text          appletv   tv
                   apple     appletv
term type          word      word
                   word      word
source start,end   0,7       5,7
                   0,5       0,7
payload
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
term position      1         2
term text          appletv   tv
                   apple     appletv
term type          word      word
                   word      word
source start,end   0,7       5,7
                   0,5       0,7
payload
Query Analyzer
org.apache.solr.analysis.WhitespaceTokenizerFactory {}
term position 1
term text appletv
term type word
source start,end 0,7
payload
org.apache.solr.analysis.SynonymFilterFactory {synonyms=query_synonyms.txt,
expand=true, ignoreCase=true}
term position 1
term text appletv
term type word
source start,end 0,7
payload
org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true}
term position 1
term text appletv
term type word
source start,end 0,7
payload
org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=1,
generateNumberParts=1, catenateWords=1, generateWordParts=1, catenateAll=1,
catenateNumbers=1}
term position 1
term text appletv
term type word
source start,end 0,7
payload
org.apache.solr.analysis.LowerCaseFilterFactory {}
term position 1
term text appletv
term type word
source start,end 0,7
payload
com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory
{protected=protwords.txt}
term position 1
term text appletv
term type word
source start,end 0,7
payload
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
term position 1
term text appletv
term type word
source start,end 0,7
payload
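
To make the position question concrete, here is a rough Python model of the term positions in the analysis output above. It is only a sketch of what the admin page shows for this one input, not Lucene's actual WordDelimiterFilter code. The function name sketch_positions and the simple lower-to-upper case split are my own assumptions for illustration; lowercasing is folded in to mimic the LowerCaseFilterFactory step.

```python
import re


def sketch_positions(token, preserve_original):
    """Toy model of the term positions seen in the analysis output.

    Hypothetical helper: it mimics only the behavior visible in the post
    (split on a lower-to-upper case change, catenate the parts, optionally
    keep the original), then lowercases everything.
    """
    # "AppleTV" -> ["Apple", "TV"]; "appletv" stays as one part.
    parts = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()

    positions = {}
    if preserve_original and len(parts) > 1:
        # preserveOriginal=1: the untouched token sits at position 1.
        positions.setdefault(1, []).append(token)
    for pos, part in enumerate(parts, start=1):
        # generateWordParts=1: each part gets its own position.
        positions.setdefault(pos, []).append(part)
    if len(parts) > 1:
        # catenateWords=1 / catenateAll=1: the joined form lands at the
        # position of the last part.
        positions.setdefault(len(parts), []).append("".join(parts))

    # LowerCaseFilterFactory step.
    return {pos: [t.lower() for t in toks] for pos, toks in positions.items()}


if __name__ == "__main__":
    print("index, preserveOriginal=1:", sketch_positions("AppleTV", True))
    print("index, preserveOriginal=0:", sketch_positions("AppleTV", False))
    print("query:                    ", sketch_positions("appletv", True))
```

In this model the only difference between the two index-side runs is whether appletv also appears at position 1; the query side always yields a single appletv at position 1. The term itself is present in the index either way, which is why the position dependence is the surprising part.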