(I think I have a horrible subject line, but I wasn't sure how to
properly explain myself.)
I have a text field that I store last names in (everything is
lowercased prior to insertion; not sure if that matters).
The field is described as:
<field name="last_name" type="text" indexed="true" stored="false"
       multiValued="true"/>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
When running a query such as
last_name:m*
I get data back like:
Pashman, Md
Maldonado
Manolidis
Fleisher, M.D., D.Ht., D.A.B.F.M.
Merino
Monroe
McLay
Maltsberger
McMurtray
Murphy Md
Loeb Md
As you can see, most are perfect matches, but some *don't* start with
the letter "M"; they only have an "M" at the beginning of another
"word" in the field.
Wouldn't the query "m*" match only values where the first letter of
the whole field is "M", and not the start of another "word" in that field?
Do I need to make another field to store last names and not perform
any analysis on that field (akin to a spell check field)?
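To make that concrete, here is a rough sketch of what I have in mind. The
field name "last_name_exact" is made up for illustration, and I am assuming
the stock "string" type (solr.StrField) from the example schema. I have not
tested this, so treat it as a guess rather than a working config:

```xml
<!-- Assumption: the unanalyzed "string" type as shipped in the example schema -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

<!-- Hypothetical companion field: each value is indexed as a single term,
     with no tokenizing, word splitting, or stemming -->
<field name="last_name_exact" type="string" indexed="true" stored="false"
       multiValued="true"/>

<!-- Copy the raw (already lowercased) last name into the unanalyzed field -->
<copyField source="last_name" dest="last_name_exact"/>
```

The idea would be to run the prefix query as last_name_exact:m* so that it is
compared against the start of the whole value instead of each "word".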
Thanks in advance.
-Rupert