Analyzer for indexing only, not for queries

Michael Kuhlmann Fri, 12 Mar 2010 00:00:59 -0800

Hi all,

I have a field with some kind of category tree as a string. The format
is like this:
        "prefix>first>second#prefix>otherfirst>othersecond"


So, the document is categorized in two categories, separated by '#', and
all categories start with the same prefix which I don't want to use.

For indexing, I have some fields for each category level, filled by
copyFields. For instance, the first level is defined using this type:

    <fieldType name="text_first_cat" class="solr.TextField"
               positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.PatternTokenizerFactory"
                       pattern="(?:#|^)\w*>([\p{L}\d]+)" group="1"/>
        </analyzer>
    </fieldType>

This works fine, except for one thing: this analyzer is also being used
for queries, not only for indexing. So a query for "x>first" gets
results, but a query for just "first" finds nothing. However, the latter
is the query I want to work.
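
To illustrate why the bare query term finds nothing, here is a minimal
Java sketch. It assumes that PatternTokenizerFactory with group="1" emits
that group for each successive match of the pattern, the way
java.util.regex.Matcher.find() does. The class and method names are made
up for illustration and are not part of Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: approximates the index analyzer with plain java.util.regex.
class FirstCategoryDemo {
    // Same pattern as in the fieldType above, with backslashes escaped for Java.
    static final Pattern FIRST_LEVEL =
            Pattern.compile("(?:#|^)\\w*>([\\p{L}\\d]+)");

    // Collects capture group 1 of every match, in order of appearance.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = FIRST_LEVEL.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Indexed value: one token per first-level category.
        System.out.println(tokens("prefix>first>second#prefix>otherfirst>othersecond"));
        // Query with a prefix: the pattern matches and yields a token.
        System.out.println(tokens("x>first"));
        // Bare query term: there is no '>' to match, so no token is produced.
        System.out.println(tokens("first"));
    }
}
```

Under that assumption, the indexed value yields the tokens "first" and
"otherfirst", the query "x>first" yields "first", and the query "first"
yields no token at all, which would explain the empty result.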

If I add a pseudo-analyzer that does nothing, like this:
        <analyzer type="query">
            <tokenizer class="solr.PatternTokenizerFactory"
                       pattern=".*" group="0"/>
        </analyzer>
then I get the result that I want. If I don't add a query analyzer at
all, the index analyzer is used for queries as well, which is strange and
not what I would expect.

I just want a
take-the-query-as-it-is-and-do-nothing-with-it analyzer, as if I hadn't
specified any analyzer at all. However, if I simply add
        <analyzer type="query" />
to it, I get a parser exception from Solr.

Is there a clean solution for this? And why does Solr ignore the
analyzer type as long as only one analyzer is defined per type?

Greetings,
Michael
