Hello,
There are lots of questions and answers in the forum regarding varying wildcard 
behaviour, but I haven't been able to find any
that address this particular behaviour. Perhaps someone could help?
Problem:
I have a fieldType that only goes through a KeywordTokenizer at index time, to 
ensure it stays 'verbatim' (i.e. it isn't split into tokens, on whitespace or 
otherwise).
Let's say there's some data stored in this field like this:


Something
Something Else
Something Else Altogether


When I query:  "Something" or "Something Else" or "*thing"  or "*omething*", I 
get back the expected results.
If, however, I query: "Some*" or "S*" or "s*" etc, I get no results (although 
this type of non-leading wildcard works fine with other fieldType schema 
elements that don't use KeywordTokenizer).
Is this something to do with KeywordTokenizer?
Is there a better way to index data that preserves case and avoids splitting on 
whitespace, stemming, etc. (i.e. no WhitespaceTokenizer or similar)?
My fieldType schema looks like this: (I've tried a number of other combinations 
as well including using class=solr.TextField)
    <fieldType name="text_verbatim" class="solr.StrField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      </analyzer>
    </fieldType>
 
    <field name="appname"  type="text_verbatim" indexed="true" stored="true"/>
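
For reference, the class=solr.TextField variant mentioned above would look 
roughly like the sketch below. The name "text_verbatim_tf" is made up here for 
illustration, and I'm assuming the same analyzer chain as in the StrField 
version; only the class attribute differs.

    <!-- Sketch only: "text_verbatim_tf" is a hypothetical name -->
    <fieldType name="text_verbatim_tf" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <!-- KeywordTokenizer emits the whole field value as a single token -->
        <tokenizer class="solr.KeywordTokenizerFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      </analyzer>
    </fieldType>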

I understand that wildcard queries don't go through analyzers, but why is it 
that 'tokenized' data matches on non-leading wildcard queries, whereas 
non-tokenized (or more specifically Keyword-Tokenized) doesn't?
The fieldType schema requires some tokenizer class, and it appears that 
KeywordTokenizer is the only one that emits a single token (i.e. the whole 
string).
I'm sure I'm missing something that is probably reasonably obvious, but having 
tried myriad combinations, I thought it prudent to ask the experts in the forum.
 
Many thanks for any insight you can provide on this.
 
Peter
 

                                          