Re: Need help with troublesome wildcard query

Christopher Cato Fri, 08 Jul 2011 06:04:37 -0700

Hi Briggs, thanks for taking the time. I have the query nearly working now. 
This is how it currently looks when it matches the title "Super Technocrane 
30" and others with similar names:


INFO: [] webapp=/solr path=/select/ 
params={qf=title^40.0&hl.fl=title&wt=json&rows=10&fl=*,score&start=0&q=(title:*super*+AND+*technocran*)+OR+(title:*super*+AND+*technocran)&qt=standard&fq=type:product+AND+language:sv}
 hits=3 status=0 QTime=1 

Adding another letter stops it matching:

INFO: [] webapp=/solr path=/select/ 
params={qf=title^40.0&hl.fl=title&wt=json&rows=10&fl=*,score&start=0&q=(title:*super*+AND+*technocrane*)+OR+(title:*super*+AND+*technocrane)&qt=standard&fq=type:product+AND+language:sv}
 hits=0 status=0 QTime=0 

The field type definitions are as follows:

<field name="title" type="text" indexed="true" stored="true" termVectors="true" 
omitNorms="true"/>

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.MappingCharFilterFactory" 
mapping="mapping-ISOLatin1Accent.txt"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" 
ignoreCase="true" expand="false"/>
        -->
        <!-- Case insensitive stop word removal.
          add enablePositionIncrements=true in both the index and query
          analyzers to leave a 'gap' for more accurate phrase queries.
        -->
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1"
                generateNumberParts="1"
                catenateWords="1"
                catenateNumbers="1"
                catenateAll="0"
                splitOnCaseChange="1"
                preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" 
protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <charFilter class="solr.MappingCharFilterFactory" 
mapping="mapping-ISOLatin1Accent.txt"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" 
ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1"
                generateNumberParts="1"
                catenateWords="0"
                catenateNumbers="0"
                catenateAll="0"
                splitOnCaseChange="1"
                preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" 
protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>


There is also a type definition called text_ws. Should I use that instead, 
changing text to text_ws in the field definition for title?

    <!-- A text field that only splits on whitespace for exact matching of 
words -->
    <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>
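
For reference, here is a rough sketch of a middle ground between text and 
text_ws. It assumes the Snowball stemmer is what keeps *technocrane* from 
matching: wildcard terms are not run through the query analyzer in Solr 3.2, 
so they are compared as typed against the stemmed tokens in the index. The 
names text_nostem and title_wild are hypothetical and not part of the stock 
schema:

```xml
<!-- Hypothetical type: whitespace tokens, lowercased, no stemming -->
<fieldType name="text_nostem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field: an unstemmed copy of title for wildcard queries -->
<field name="title_wild" type="text_nostem" indexed="true" stored="false"/>
<copyField source="title" dest="title_wild"/>
```

With something like this, the wildcard clauses would target title_wild 
instead of title. The input would still have to be lowercased on the client 
side, and the documents reindexed after the schema change.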




Best regards,

Christopher Cato
Head of Technology
-----------------------------------
MiniMedia
Phone: +46761927603
www.minimedia.se

On 7 Jul 2011 at 23:16, Briggs Thompson wrote:

> Hello Christopher,
> 
> Can you provide the exact query sent to Solr for the one word query and also
> the two word query? The field type definition for your title field would be
> useful too.
> 
> From what I understand, Solr should be able to handle your use case. I am
> guessing it is a problem with how the field is defined assuming the query is
> correct.
> 
> Briggs Thompson
> 
> On Thu, Jul 7, 2011 at 12:22 PM, Christopher Cato <
> christopher.c...@minimedia.se> wrote:
> 
>> Hi, I'm running Solr 3.2 with edismax under Tomcat 6 via Drupal.
>> 
>> I'm having some problems writing a query that matches a specific field on
>> several words. I have implemented an AJAX search that basically takes
>> whatever is in a form field and attempts to match documents. I'm not having
>> much luck though. The first word always matches correctly, but as soon as I
>> enter the second word I start losing matches, and the third word doesn't give
>> any matches at all.
>> 
>> The title field that I'm searching contains a product name that may or may
>> not have several words.
>> 
>> The requirement is that the search should be progressive i.e. as the user
>> inputs words I should always return results that contain all of the words
>> entered. I also have to correct bad input such as an erroneous space in the
>> product name, e.g. "product name" instead of "productname".
>> 
>> I'm wondering if there isn't an easier way to query Solr. Ideally I'd want
>> to say "give me all docs that have the following text in their titles". Is
>> that possible?
>> 
>> 
>> I'd really appreciate any help!
>> 
>> 
>> Regards,
>> Christopher Cato
