query parsing & wildcards

Charles Hornberger Wed, 28 Nov 2007 09:43:17 -0800

I'm confused by some behavior I'm seeing in Solr (I'm using 1.2.0). I
have a field named "description", declared with the following
fieldType:


    <fieldType name="textTightUnstemmed" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory"
                synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="0" generateNumberParts="0" catenateWords="1"
                catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

The problem I'm having is that when I search for description:deck*, I
get the results I expect; when I search for description:Deck*, I get
nothing. I want both queries to return the same result set. (I'm using
the standard request handler.)

Interestingly, when I search for description:Deck from the web
interface, the debug output shows that the query term is converted to
lowercase:

<str name="rawquerystring">description:Deck</str>
<str name="querystring">description:Deck</str>
<str name="parsedquery">description:deck</str>
<str name="parsedquery_toString">description:deck</str>

... but when I search for description:Deck*, it shows that it is not:

<str name="rawquerystring">description:Deck*</str>
<str name="querystring">description:Deck*</str>
<str name="parsedquery">description:Deck*</str>
<str name="parsedquery_toString">description:Deck*</str>

What am I doing wrong here?
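
To make the discrepancy concrete, here is a toy model in Python of what the
debug output above seems to show. It rests on an assumption I have not
confirmed: that the query parser runs plain terms through the field's query
analyzer but passes wildcard terms through untouched. The function names
(analyze, parse_term, lowercase_wildcard_term) are made up for illustration
and are not Solr or Lucene APIs. The last function sketches a possible
client-side workaround, lowercasing a single wildcard term before it is sent.

```python
def analyze(term: str) -> str:
    """Toy stand-in for the query analyzer: only the lowercasing step."""
    return term.lower()


def parse_term(field: str, text: str) -> str:
    """Mimic the observed parsedquery values (assumed behavior, not Solr code).

    Terms containing a wildcard character skip analysis entirely;
    all other terms are analyzed (here: lowercased).
    """
    if "*" in text or "?" in text:
        return f"{field}:{text}"
    return f"{field}:{analyze(text)}"


def lowercase_wildcard_term(text: str) -> str:
    """Hypothetical client-side workaround for a single term.

    Lowercases the term only if it contains a wildcard, so it can match
    index terms that were lowercased at index time.
    """
    if "*" in text or "?" in text:
        return text.lower()
    return text


print(parse_term("description", "Deck"))
print(parse_term("description", "Deck*"))
print(parse_term("description", lowercase_wildcard_term("Deck*")))
```

This works on one term at a time on purpose: lowercasing a whole query
string would also lowercase operators such as AND and OR.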

Also, when I use the Field Analysis tool for description:Deck*, it
shows the following (sorry for the bad copy/paste):

Query Analyzer
org.apache.solr.analysis.WhitespaceTokenizerFactory {}
term position   1
term text       Deck*
term type       word
source start,end        0,5
org.apache.solr.analysis.SynonymFilterFactory {synonyms=synonyms.txt, expand=false, ignoreCase=true}
term position   1
term text       Deck*
term type       word
source start,end        0,5
org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt, ignoreCase=true}
term position   1
term text       Deck*
term type       word
source start,end        0,5
org.apache.solr.analysis.WordDelimiterFilterFactory {generateNumberParts=0, catenateWords=1, generateWordParts=0, catenateAll=0, catenateNumbers=1}
term position   1
term text       Deck
term type       word
source start,end        0,4
org.apache.solr.analysis.LowerCaseFilterFactory {}
term position   1
term text       deck
term type       word
source start,end        0,4
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
term position   1
term text       deck
term type       word
source start,end        0,4

Thanks,
Charlie
