Jamie, the problem with that approach is that you can no longer do exact matching. For this reason, it is good style to have two fields (one exact, one phonetic), to query both through a query expander such as dismax (so that exact matches are preferred and phonetic matches count for less), and to rely on the phonetic matches only when you sort by score.
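As a rough sketch of the two-field setup described above, something like the following could work. The names text_phonetic, title, title_phonetic and /phonetic are made up for illustration, the boosts are arbitrary, and the stock "text" type is assumed to be left as it ships in the example schema (without the phonetic filter):

```xml
<!-- schema.xml (sketch): a separate phonetic type next to the unmodified "text" type.
     "text_phonetic" is a hypothetical name. -->
<fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="false" replaces each token with its phonetic code
         instead of adding the code alongside the original token -->
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
  </analyzer>
</fieldType>

<!-- One exact field and one phonetic field, filled from the same source.
     "title" and "title_phonetic" are hypothetical field names. -->
<field name="title" type="text" indexed="true" stored="true"/>
<field name="title_phonetic" type="text_phonetic" indexed="true" stored="false"/>
<copyField source="title" dest="title_phonetic"/>

<!-- solrconfig.xml (sketch): dismax searches both fields and boosts the exact one.
     The handler name and the boost values are assumptions, to be tuned. -->
<requestHandler name="/phonetic" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^10 title_phonetic^1</str>
  </lst>
</requestHandler>
```

With a setup along these lines, a document matching on title scores higher than one matching only on title_phonetic, so the exact hits come first when results are sorted by score. It does cost one extra field per phonetic-searchable field.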
Hope it helps,
paul

On 23 May 2011 at 21:43, Jamie Johnson wrote:

> I am new to solr and am trying to determine the best way to take the text
> field type (the one in the example) and add phonetic searches to it.
> Currently I have done the following:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
>     autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.DoubleMetaphoneFilterFactory"/>
>     <!-- in this example, we will only use synonyms at query time
>     <filter class="solr.SynonymFilterFactory"
>         synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>     -->
>     <!-- Case insensitive stop word removal.
>       add enablePositionIncrements=true in both the index and query
>       analyzers to leave a 'gap' for more accurate phrase queries.
>     -->
>     <filter class="solr.StopFilterFactory"
>         ignoreCase="true"
>         words="stopwords.txt"
>         enablePositionIncrements="true"
>     />
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
>         catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>         protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.DoubleMetaphoneFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>         ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>         ignoreCase="true"
>         words="stopwords.txt"
>         enablePositionIncrements="true"
>     />
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="0"
>         catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>         protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> which seems to work. Is this appropriate or is there a better way of doing
> this? I had previously defined a custom phonetic field but that would mean
> for each field which I wanted to support a phonetic style search I would
> need to add an additional field. Adding it to the text seemed much more
> elegant since it would work for all text fields. Is there a reason not to
> do this (i.e. performance, index size, etc)? Any insight/guidance would be
> greatly appreciated.
>
> Also if anyone could point me to what exactly filters do (docs) I would
> appreciate it. My assumption is that they inject additional tokens based on
> the specific filter class. Am I correct?