Norberto Meijome wrote:
ok well let's say that i can live without john/jon in the short term.
what i really need today is a case insensitive wildcard search with
literal matching (no fancy stemming. bobby is bobby, not bobbi.)
what are my options?
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Define your own field type (or modify text / string, though I find it gets
confusing to have variations of text / string) to perform the operations you
need on the content.
There are also other tokenizers/analysers available that *may* help with
partial searches (ngram, edgengram), but there isn't much documentation on
them yet (that I could find). I am only getting into them myself... I'll see
how it goes.
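For what it's worth, here is a rough, untested sketch of how an edge n-gram
type might look, going by the filter list on that wiki page. The type name
text_prefix and the gram sizes are placeholders I made up;
solr.EdgeNGramFilterFactory with minGramSize / maxGramSize is the stock
factory, but check that your Solr version ships it.

```xml
<!-- hypothetical type name; gram sizes are arbitrary placeholders -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- index every leading substring of each token: b, bo, bob, ... -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <!-- no n-grams at query time: the typed prefix is matched as-is -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With something like this, a plain query such as user_name:bob should match
bobby without needing a trailing *, at the cost of a larger index.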
thanks, that got me on the right track. i came up with this:
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
now searching for user_name:bobby* works as i wanted.
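side note, in case it helps anyone else: since the index and query chains
are identical, i believe the same type can be declared with a single
analyzer element that solr then uses for both. a sketch of that shorter form
(same name and factories as above, not something i've tested separately):

```xml
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <!-- no type attribute: assumed to apply to both indexing and querying -->
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

one caveat, as far as i can tell: wildcard terms are not run through the
query analyzer, so the prefix has to be lowercased before it is sent
(user_name:bobby*, not user_name:Bobby*).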
my next question: is there a way that i can score matches that are at
the start of the string higher than matches in the middle? for example,
if i search for steve, i get kelly stevenson before steve jobs. i'd
like steve jobs to come first.
-jsd-