Multi-field searches are one reason to do the tokenizing in the parser.

Imagine your query is "name:bob content:climate".

The parser can split the query into "name:bob" and "content:climate" and pass each clause to the analyzer for its own field.
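
To make that concrete, here is a rough Python sketch of the idea. It is not Solr's actual parser or API: the `ANALYZERS` table, the two analyzer functions, and `parse` are hypothetical names made up for illustration, and real query syntax (quotes, boolean operators, escapes) is ignored.

```python
# Hypothetical per-field analyzers, standing in for what a schema would configure.
def keyword_lowercase(text):
    # Keeps the whole input as a single lowercased token.
    return [text.lower()]

def whitespace_lowercase(text):
    # Splits on whitespace and lowercases each token.
    return text.lower().split()

# Assumed field-to-analyzer mapping (illustrative only).
ANALYZERS = {
    "name": keyword_lowercase,
    "content": whitespace_lowercase,
}

def parse(query, default_field="content"):
    """Split the query into field clauses, then analyze each clause
    with the analyzer registered for its field."""
    clauses = []
    for chunk in query.split():
        field, sep, text = chunk.partition(":")
        if not sep:
            # No "field:" prefix, so fall back to the default field.
            field, text = default_field, chunk
        analyzer = ANALYZERS.get(field, whitespace_lowercase)
        clauses.append((field, analyzer(text)))
    return clauses

print(parse("name:bob content:climate"))
```

The point is only that the parser has to find the clause boundaries before it can know which field's analyzer applies to which piece of text.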

Tommy Chheng
Programmer and UC Irvine Graduate Student
Twitter @tommychheng
http://tommy.chheng.com


On 3/25/10 7:37 PM, Jason Chaffee wrote:
I am curious as to why the query parser does any tokenizing. I would think you would want to control/configure this with your analyzers.

Does anyone know the answer to this? Is there a performance gain or something?

Thanks,

Jason

On Mar 25, 2010, at 4:04 PM, "Ahmet Arslan" <iori...@yahoo.com> wrote:

> I have the following configured for a particular field:
>
> <analyzer type="query">
>   <tokenizer class="solr.KeywordTokenizerFactory" />
>   <filter class="solr.LowerCaseFilterFactory" />
> </analyzer>
>
> I am using dismax and querying multiple fields, and I expect the query to
> be parsed differently for each field. For some reason, it is not kept as a
> single token for this field's query. For example, the query "Apple Store"
> is being broken into two tokens, "apple" and "store". I would expect it to
> be "apple store".
>
> Does anyone have ideas of what might be going on here?

Before the analysis phase, QueryParser splits the query on whitespace, so the field's analyzer only ever sees one word at a time. You can alter this behavior by escaping the whitespace with a backslash: apple\ store
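
As a rough illustration of that behavior (a Python sketch, not the actual QueryParser code; `split_on_unescaped_whitespace` is a made-up helper), splitting on whitespace while honoring backslash escapes looks something like this:

```python
def split_on_unescaped_whitespace(query):
    """Split a query string on whitespace, treating a backslash as an
    escape that keeps the following character (e.g. a space) literal."""
    chunks, current = [], []
    chars = iter(query)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally, even if it is whitespace.
            nxt = next(chars, "")
            if nxt:
                current.append(nxt)
        elif ch.isspace():
            # Unescaped whitespace ends the current chunk.
            if current:
                chunks.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        chunks.append("".join(current))
    return chunks

# Each chunk is what would then be handed to the field's analyzer.
print(split_on_unescaped_whitespace("Apple Store"))
print(split_on_unescaped_whitespace("Apple\\ Store"))
```

With the unescaped query, a keyword tokenizer plus lowercase filter would receive "Apple" and "Store" separately and produce "apple" and "store". With the escaped space it would receive "Apple Store" as one chunk and produce the single token "apple store".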



