Re: keyword query tokenizer

Tommy Chheng Thu, 25 Mar 2010 20:25:38 -0700

Multi-field searches are one reason for doing the tokenizing in the parser.


Imagine if your query was "name:bob content:climate"

The parser can tokenize the query into "name:bob" and "content:climate" and pass each one to its own field's analyzer.
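
As a rough illustration of that idea (a toy model, not Solr's actual parser), the sketch below splits the query on whitespace and routes each clause to an analyzer chosen by field name. The names split_clauses, ANALYZERS and analyze_query, the default field, and the two analyzers are all hypothetical.

```python
# Illustrative only: a toy model of per-field dispatch, not Solr's parser.

DEFAULT_FIELD = "content"  # hypothetical default for clauses without a field


def split_clauses(query):
    """Split a query on whitespace into (field, text) pairs."""
    clauses = []
    for clause in query.split():
        field, sep, text = clause.partition(":")
        if not sep:
            # No "field:" prefix, so fall back to the assumed default field.
            field, text = DEFAULT_FIELD, clause
        clauses.append((field, text))
    return clauses


# Hypothetical per-field analyzers: each maps raw text to a list of tokens.
ANALYZERS = {
    "name": lambda text: [text],             # keep the text as typed
    "content": lambda text: [text.lower()],  # lowercase the text
}


def analyze_query(query):
    """Run each clause through the analyzer registered for its field."""
    return [(field, ANALYZERS[field](text))
            for field, text in split_clauses(query)]


print(analyze_query("name:bob content:climate"))
```

The point of the sketch is only that the parser has to find the clause boundaries before it can know which field's analyzer applies to which piece of text.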


Tommy Chheng
Programmer and UC Irvine Graduate Student
Twitter @tommychheng
http://tommy.chheng.com


On 3/25/10 7:37 PM, Jason Chaffee wrote:

I am curious as to why the query parser does any tokenizing? I would think you would want to control/configure this with your analyzers?
Does anyone know the answer to this? Is there a performance gain or something?
Thanks,

Jason

On Mar 25, 2010, at 4:04 PM, "Ahmet Arslan" <iori...@yahoo.com> wrote:
> I have the following configured for a particular field:
>
> <analyzer type="query">
>   <tokenizer class="solr.KeywordTokenizerFactory" />
>   <filter class="solr.LowerCaseFilterFactory" />
> </analyzer>
>
> I am using dismax and querying multiple fields, and I expect the query to
> be parsed differently for each field.  For some reason, it is not kept as
> a single token for this field's query.  For example, the query "Apple
> Store" is being broken into two tokens, "apple" and "store".  I would
> expect it to be "apple store".
>
> Does anyone have ideas of what might be going on here?
Before the analysis phase, QueryParser splits the query on whitespace. You can alter this behavior by escaping the whitespace with a backslash: apple\ store
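
To make the effect concrete, here is a minimal sketch of that two-stage behavior, assuming a simplified model in which the parser splits on unescaped whitespace and then hands each chunk to a keyword-plus-lowercase analyzer. It is not Lucene or Solr code; split_on_unescaped_whitespace, keyword_lowercase and analyze are hypothetical names.

```python
# Illustrative only: a simplified model of "split first, analyze second".


def split_on_unescaped_whitespace(query):
    """Split on whitespace, treating a backslash as an escape character."""
    chunks, current = [], []
    i = 0
    while i < len(query):
        ch = query[i]
        if ch == "\\" and i + 1 < len(query):
            # Escaped character: keep it literally, even if it is a space.
            current.append(query[i + 1])
            i += 2
            continue
        if ch.isspace():
            # Unescaped whitespace ends the current chunk.
            if current:
                chunks.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if current:
        chunks.append("".join(current))
    return chunks


def keyword_lowercase(chunk):
    """Stand-in for a keyword tokenizer plus lowercase filter: one token."""
    return [chunk.lower()]


def analyze(query):
    """Apply the analyzer to each chunk the splitter produced."""
    return [token
            for chunk in split_on_unescaped_whitespace(query)
            for token in keyword_lowercase(chunk)]


print(analyze("Apple Store"))    # the analyzer sees two separate chunks
print(analyze(r"Apple\ Store"))  # the escaped space keeps one chunk
```

In this model the keyword analyzer never sees "Apple Store" as a whole unless the space is escaped, which matches the behavior described above.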
