Hello,
There are lots of questions and answers in the forum about varying wildcard
behaviour, but I haven't been able to find any that address this particular
case. Perhaps someone could help?
Problem:
I have a fieldType that only goes through a KeywordTokenizer at index time, to
ensure it stays 'verbatim' (i.e. it doesn't get split into any tokens, on
whitespace or otherwise).
Let's say there's some data stored in this field like this:
Something
Something Else
Something Else Altogether
When I query "Something" or "Something Else" or "*thing" or "*omething*", I
get back the expected results.
If, however, I query "Some*" or "S*" or "s*" etc., I get no results (although
this type of trailing, non-leading wildcard works fine with other fieldTypes
in the schema that don't use KeywordTokenizer).
Is this something to do with KeywordTokenizer?
Is there a better way to index data that preserves case and doesn't split on
whitespace, stem, etc. (i.e. no WhitespaceTokenizer or similar)?
My fieldType schema looks like this (I've tried a number of other combinations
as well, including class=solr.TextField):
<fieldType name="text_verbatim" class="solr.StrField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
<field name="appname" type="text_verbatim" indexed="true" stored="true"/>
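For comparison, the solr.TextField variant mentioned above would look roughly
like the sketch below. It assumes the same analyzer chain as the StrField
version, with only the class changed; the names "text_verbatim_tf" and
"appname_tf" are made up for illustration and are not from my actual schema.

```xml
<!-- Sketch only: "text_verbatim_tf" and "appname_tf" are hypothetical names -->
<fieldType name="text_verbatim_tf" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <!-- KeywordTokenizer emits the entire field value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Same synonym filter as in the StrField version above -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
<field name="appname_tf" type="text_verbatim_tf" indexed="true" stored="true"/>
```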
I understand that wildcard queries don't go through analyzers, but why is it
that 'tokenized' data matches on non-leading wildcard queries, whereas
non-tokenized (or more specifically Keyword-Tokenized) doesn't?
The fieldType schema requires some tokenizer class, and it appears that
KeywordTokenizer is the only one that emits the whole string as a single
token.
I'm sure I'm missing something that is probably reasonably obvious, but having
tried myriad combinations, I thought it prudent to ask the experts in the forum.
Many thanks for any insight you can provide on this.
Peter