I have a field defined as:

  <field name="content" type="text" indexed="true" stored="false" termVectors="true" multiValued="true" />

where "text" is unmodified from the example schema.xml that came with Solr 1.4.1.
I have documents indexed that contain compound words such as Sandstone, and in several cases camel-case words such as MaxSize.

If I query in all lower case (sandstone or maxsize), I get the documents I expect. If I query with only the first letter capitalized (Sandstone or Maxsize), I also get the documents I expect. However, if the query is camel case (MaxSize or SandStone), it doesn't find the documents. The MaxSize case is particularly frustrating because that is the exact case of the word as it was indexed.

Is this expected behavior?

The query analyzer definition for the "text" field type is:

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" ignoreCase="true" expand="true" synonyms="synonyms.txt"/>
    <filter class="solr.StopFilterFactory" enablePositionIncrements="true" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" catenateAll="0" catenateNumbers="0" catenateWords="0" generateNumberParts="1" generateWordParts="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter language="English" class="solr.SnowballPorterFilterFactory" protected="protwords.txt"/>
  </analyzer>

Is the order of the filters important? If LowerCaseFilterFactory came before WordDelimiterFilterFactory, would that fix this? Would it break something else?

Thanks,
Ken
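
P.S. For reference, the reordering I am asking about would look roughly like the sketch below. It is untested. Only the position of LowerCaseFilterFactory has changed; everything else is copied from the query analyzer above. My understanding is that with lowercasing first, WordDelimiterFilterFactory never sees a case change, so splitOnCaseChange="1" would have no effect at query time and MaxSize would be looked up as the single token maxsize. Whether that token matches anything depends on the index-time analyzer, which I have not shown here, so treat this as an assumption rather than a known fix.

```xml
<!-- Sketch only (untested): query analyzer with LowerCaseFilterFactory moved ahead of WordDelimiterFilterFactory -->
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" ignoreCase="true" expand="true" synonyms="synonyms.txt"/>
  <filter class="solr.StopFilterFactory" enablePositionIncrements="true" words="stopwords.txt" ignoreCase="true"/>
  <!-- Moved up: tokens are lowercased before the word delimiter runs -->
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- splitOnCaseChange is left as-is, but no case changes remain for it to split on -->
  <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" catenateAll="0" catenateNumbers="0" catenateWords="0" generateNumberParts="1" generateWordParts="1"/>
  <filter language="English" class="solr.SnowballPorterFilterFactory" protected="protwords.txt"/>
</analyzer>
```

The other option I can see is to leave the order alone and set splitOnCaseChange="0" on the query side, which I assume would have the same effect on camel-case queries while still splitting on delimiters such as hyphens.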