I should have Googled better. It seems that my question has been asked and answered already, and not just once:
http://www.nabble.com/Using-wildcard-with-accented-words-tf4673239.html
http://groups.google.com/group/acts_as_solr/browse_thread/thread/42920dc2dcc5fa88

On Nov 28, 2007 9:42 AM, Charles Hornberger <[EMAIL PROTECTED]> wrote:
> I'm confused by some behavior I'm seeing in Solr (i'm using 1.2.0). I
> have a field named "description", declared with the following
> fieldType:
>
> <fieldType name="textTightUnstemmed" class="solr.TextField"
>            positionIncrementGap="100" >
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="0" catenateWords="1"
>             catenateNumbers="1" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> The problem I'm having is that when I search for description:deck*, I
> get the results I expect; when I search for description:Deck*, I get
> nothing. I want both queries to return the same result set. (I'm using
> the standard request handler.)
>
> Interestingly, when I search for description:Deck from the web
> interface, the debug output shows that the query term is converted to
> lowercase:
>
> <str name="rawquerystring">description:Deck</str>
> <str name="querystring">description:Deck</str>
> <str name="parsedquery">description:deck</str>
> <str name="parsedquery_toString">description:deck</str>
>
> ... but when I search for description:Deck*, it shows that it is not:
>
> <str name="rawquerystring">description:Deck*</str>
> <str name="querystring">description:Deck*</str>
> <str name="parsedquery">description:Deck*</str>
> <str name="parsedquery_toString">description:Deck*</str>
>
> What am I doing wrong here?
>
> Also, when I use the Field Analysis tool for description:Deck*, it
> shows the following (sorry for the bad copy/paste):
>
> Query Analyzer
> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
> term position 1
> term text Deck*
> term type word
> source start,end 0,5
> org.apache.solr.analysis.SynonymFilterFactory {synonyms=synonyms.txt,
> expand=false, ignoreCase=true}
> term position 1
> term text Deck*
> term type word
> source start,end 0,5
> org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
> ignoreCase=true}
> term position 1
> term text Deck*
> term type word
> source start,end 0,5
> org.apache.solr.analysis.WordDelimiterFilterFactory
> {generateNumberParts=0, catenateWords=1, generateWordParts=0,
> catenateAll=0, catenateNumbers=1}
> term position 1
> term text Deck
> term type word
> source start,end 0,4
> org.apache.solr.analysis.LowerCaseFilterFactory {}
> term position 1
> term text deck
> term type word
> source start,end 0,4
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
> term position 1
> term text deck
> term type word
> source start,end 0,4
>
> Thanks,
> Charlie
>
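
For anyone who finds this thread later: judging by the debug output quoted above, the wildcard term description:Deck* reaches the index without passing through the field's analyzer, so LowerCaseFilterFactory never sees it. One possible workaround is to lowercase wildcard terms on the client before sending the query. Below is a rough sketch in Python. The function name lowercase_wildcard_terms and the regex are my own illustration and not part of Solr. It assumes simple whitespace-separated queries and does not handle quoted phrases or escaped characters.

```python
import re

# Hypothetical helper, not part of Solr: a deliberately naive pattern that
# matches any whitespace-delimited token containing a wildcard (* or ?).
_WILDCARD_TOKEN = re.compile(r"\S*[*?]\S*")


def lowercase_wildcard_terms(query: str) -> str:
    """Lowercase the term part of every wildcard token in a query string.

    The field name (everything up to the last ':') is left untouched,
    because Solr field names are case-sensitive. Tokens without a
    wildcard are left alone, since those are analyzed by the field's
    query analyzer anyway.
    """

    def _lower(match: "re.Match[str]") -> str:
        token = match.group(0)
        # rpartition returns ('', '', token) when there is no ':' at all,
        # so bare terms such as 'Deck*' are handled by the same code path.
        field, sep, term = token.rpartition(":")
        return field + sep + term.lower()

    return _WILDCARD_TOKEN.sub(_lower, query)


if __name__ == "__main__":
    print(lowercase_wildcard_terms("description:Deck*"))
```

This only helps when the index-time analyzer lowercases its tokens, as the fieldType quoted above does. It does nothing for the other index-time transformations (synonyms, word delimiting), which wildcard terms would still bypass.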