Searching for tokens does not return any results

Yetkin Ozkucur Thu, 01 May 2014 07:05:05 -0700

Hello everyone,

I am new to Solr and this is my first post on this list.
I have been working on this problem for a couple of days. I have tried
everything I could find on Google, but it looks like I am missing something.


Here is my problem:
I have a field called: DBASE_LOCAT_NM_TEXT
It contains values like: CRD_PROD
The goal is to be able to search this field either by the exact string
"CRD_PROD" or by any part of it (tokenized on "_"), such as "CRD" or "PROD".

Currently: 
This query returns results: q=DBASE_LOCAT_NM_TEXT:CRD_PROD
But this does not: q=DBASE_LOCAT_NM_TEXT:CRD
I want to understand why the second query does not return any results.

Here is how I configured the field:
<field name="DBASE_LOCAT_NM_TEXT" type="text_general" indexed="true"
       stored="true" required="false" multiValued="false"/>

And here is how I configured the field type:
    <fieldType name="text_general" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer type="index">
        <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0"
                splitOnCaseChange="1"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="0" catenateNumbers="0" catenateAll="0"
                splitOnCaseChange="1"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

I am also using the analysis panel in the Solr admin console. For the
index-time analyzer and the input CRD_PROD, it shows this:
WT      CRD_PROD

WDF     CRD_PROD
        CRD
        PROD
        CRDPROD

SF      CRD_PROD
        CRD
        PROD
        CRDPROD

LCF     crd_prod
        crd
        prod
        crdprod

SKMF    crd_prod
        crd
        prod
        crdprod

RDTF    crd_prod
        crd
        prod
        crdprod
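
To make the expected token stream concrete, here is a minimal standalone
sketch of what I understand the index-time chain above to be doing for this
input. It is not Solr's code: the class name DelimiterSketch and the method
indexTokens are made up for illustration, it only splits on non-alphanumeric
characters (no case-change splitting, no stop words, no protected words), and
it treats letters and digits alike when catenating.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class DelimiterSketch {

    // Hypothetical, simplified stand-in for the index-time chain:
    // whitespace split, delimiter split (original kept, parts, catenation),
    // lowercase, then de-duplication. Not the actual Solr implementation.
    static List<String> indexTokens(String text) {
        Set<String> tokens = new LinkedHashSet<>(); // keeps order, drops duplicates
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            // Split the word on runs of non-alphanumeric characters such as "_".
            List<String> parts = new ArrayList<>();
            for (String part : word.split("[^A-Za-z0-9]+")) {
                if (!part.isEmpty()) {
                    parts.add(part);
                }
            }
            // Mirrors preserveOriginal="1": keep the unsplit word.
            tokens.add(word.toLowerCase(Locale.ROOT));
            // Mirrors generateWordParts="1": emit each part.
            for (String part : parts) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
            // Mirrors catenateWords="1": emit the parts joined together.
            if (parts.size() > 1) {
                tokens.add(String.join("", parts).toLowerCase(Locale.ROOT));
            }
        }
        return new ArrayList<>(tokens);
    }

    public static void main(String[] args) {
        // Prints [crd_prod, crd, prod, crdprod]
        System.out.println(indexTokens("CRD_PROD"));
    }
}
```

If the index really contained these four terms for the field, I would expect
the query for CRD to match, which is why the empty result confuses me.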


I am not sure whether it is related, but this index was created by a Java
program using the Lucene API. It used StandardAnalyzer for writing, and the
field was configured as tokenized, indexed, and stored. Does this affect the
Solr configuration?
        
Can you please help me understand what I am missing and how I can debug it?

Thanks,
Yetkin
