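And, in fact, you do NOT need to have two. If the index and query chains are identical, just specify one analysis chain with no type qualifier, i.e. a plain <analyzer> element. Solr then applies that single chain at both index time and query time.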
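
A minimal sketch of that single-chain form, assuming a field type where the same filters suit both indexing and querying. The name "text_same" is hypothetical, and the filter list is only an abbreviated selection from the schema quoted in this thread, not a recommendation:

```xml
<!-- Hypothetical field type; "text_same" is an illustrative name only. -->
<fieldType name="text_same" class="solr.TextField" positionIncrementGap="100">
  <!-- No type attribute: this one chain is used for both index and query analysis. -->
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Splitting this into separate type="index" and type="query" analyzers only becomes necessary when a filter such as WordDelimiterFilterFactory needs different settings on each side.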

On Thu, Nov 8, 2012 at 9:44 AM, Jack Krupansky <j...@basetechnology.com> wrote:

> Many token filters will be used 100% identically for both "index" and
> "query" analysis, but WordDelimiterFilter is a rare exception. The issue is
> that at index time it has the ability to generate multiple tokens at the
> same position (the "catenate" options), any of which can be queried, but at
> query time it can be problematic to have these "extra" terms (except in
> some conditions), so the WDF settings suppress generation of the extra
> terms.
>
> Another example is synonyms - generate extra terms at index time for
> greater precision of searches, but limit the query terms to exclude the
> "extra" terms.
>
> That's the reason for the occasional asymmetry between index-time and
> query-time analyzers.
>
> -- Jack Krupansky
>
> -----Original Message----- From: johnmu...@aol.com
> Sent: Wednesday, November 07, 2012 7:13 PM
> To: solr-user@lucene.apache.org
> Subject: Questions about schema.xml
>
> Hi,
>
> Can someone help me understand the meaning of <analyzer type="index"> and
> <analyzer type="query"> in schema.xml, how they are used and what do I get
> back when the values are not the same?
>
> For example, given:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
>     autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>         words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
>         catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>         protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>         ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>         words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="0"
>         catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>         protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> If I make the entire content of "index" the same as "query" (or the other
> way around) how will that impact my search? And why would I want to not
> make those two blocks the same?
>
> Thanks!!!
>
> -MJ