Matching Queries with Wildcards and Numbers

Ellington Kirby Wed, 17 Jun 2015 10:59:33 -0700

Hi! I am a Solr user having trouble with wildcard searches, specifically
when the search term combines a wildcard operator with a number. Here is
an example.

My query looks like (productTitle:*Sidem2*) and matches nothing, when it
should match the productTitle Sidem2. However, searching for Sidem does
match the productTitle Sidem2. I have narrowed the failure down to product
titles that contain a number: for example, a query for
(productTitle:*Cupx Collapsed*) correctly matches the product Cupx
Collapsed.

I need the wildcard operators around the query to support an auto-complete
feature: when a user stops typing, a search is executed on their input so
far, and it should match the correct product titles.

I have looked through the excellent book Solr in Action by Grainger and
Potter, Stack Overflow, and several blog posts, and have not found anything
on this specific issue. Common advice is to remove the stemmer, which I
have done. I have also added the ReversedWildcardFilterFactory. Below is
the schema definition for the fieldType in question, in case it helps.
Please let me know if anyone has any tips or clues! I am not a very
experienced Solr user and would really appreciate any advice.



    <fieldType name="text_en_splitting" class="solr.TextField"
               positionIncrementGap="100" autoGeneratePhraseQueries="true">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <!-- in this example, we will only use synonyms at query time
            <filter class="solr.SynonymFilterFactory"
                    synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
            -->
            <!-- Case insensitive stop word removal. -->
            <filter class="solr.StopFilterFactory"
                    ignoreCase="true"
                    words="lang/stopwords_en.txt"/>
            <!-- Concatenate characters and numbers by setting catenateAll
                 to 1 - this will avoid problems with alphabetical sort -->
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1" catenateWords="1"
                    catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.KeywordMarkerFilterFactory"
                    protected="protwords.txt"/>
            <filter class="solr.ReversedWildcardFilterFactory"
                    withOriginal="true"
                    maxPosAsterisk="2" maxPosQuestion="1" minTrailing="2"
                    maxFractionAsterisk="0"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory"
                    synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.StopFilterFactory"
                    ignoreCase="true"
                    words="lang/stopwords_en.txt"/>
            <!-- Concatenate characters and numbers by setting catenateAll
                 to 1 - this will avoid problems with alphabetical sort -->
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1" catenateWords="0"
                    catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
                    preserveOriginal="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.KeywordMarkerFilterFactory"
                    protected="protwords.txt"/>
        </analyzer>
    </fieldType>
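
A note for readers comparing the two analyzers above: the index analyzer's
WordDelimiterFilterFactory sets catenateAll="0" and omits preserveOriginal,
so a title like Sidem2 is probably indexed as the separate tokens sidem and
2, with no single sidem2 token for *Sidem2* to match. Wildcard terms
generally skip most of the query analysis chain, so the query side would not
be split the same way. The fragment below is a minimal sketch of one variant
to test, assuming that diagnosis is right. It only adds preserveOriginal="1"
(an attribute already used in the query analyzer) to the index-time filter,
and it is not a confirmed fix.

```xml
<!-- Hypothetical index-time variant (an assumption, not the posted config):
     preserveOriginal="1" keeps the unsplit token, e.g. Sidem2, in the index
     alongside the split parts Sidem and 2. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="1"
        catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
        preserveOriginal="1"/>
```

A change like this would only take effect after reindexing. The Analysis
screen in the Solr admin UI can show which tokens each analyzer actually
emits for Sidem2.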


Thank you in advance!
--From a sincerely puzzled Solr user, Ellington Kirby
