Re: Tokenizer question

Avlesh Singh Mon, 11 Jan 2010 20:14:52 -0800

>
> If the analyzer produces multiple Tokens, but they all have the same
> position then the QueryParser produces a BooleanQuery with all SHOULD
> clauses.  -- This is what allows simple synonyms to work.
>
You rock Hoss!!! This is exactly the explanation I was looking for .. it is
as simple as it sounds. Thanks!


Cheers
Avlesh

On Tue, Jan 12, 2010 at 6:37 AM, Chris Hostetter
<hossman_luc...@fucit.org> wrote:

>
> : q=PostCode:(1078 pw)+AND+HouseNumber:(39-43)
> :
> : the resulting parsed query contains a phrase query:
> :
> : +(PostCode:1078 PostCode:pw) +PhraseQuery(HouseNumber:"39 43")
>
> This stems from some fairly fundamental behavior in the QueryParser ...
> each "chunk" of input that isn't deemed "markup" (ie: not field names, or
> special characters) is sent to the analyzer.  If the analyzer produces
> multiple tokens at different positions, then a PhraseQuery is constructed.
> -- Things like simple phrase searches and N-Gram based partial matching
> require this behavior.
>
> If the analyzer produces multiple Tokens, but they all have the same
> position then the QueryParser produces a BooleanQuery with all SHOULD
> clauses.  -- This is what allows simple synonyms to work.
>
> If you write a simple TokenFilter to "flatten" all of the positions to be
> the same, and use it after WordDelimiterFilter then it should give you the
> "OR" style query you want.
>
> This isn't the default behavior because the Phrase behavior of WDF fits
> its intended case better --- someone searching for a product sku
> like X3QZ-D5 expects it to match X-3QZD5, but not just "X" or "3QZ"
>
> -Hoss
>
>
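
For readers who want to see the rule from the quoted message in concrete form, here is a minimal sketch in plain Java. It is a toy model and not Lucene's API: the class QueryShapeSketch, the Tok type, and the methods flatten and shape are all hypothetical names made up for illustration. It assumes each token carries a position increment relative to the previous token, where 0 means "same position". In a real Lucene TokenFilter the flattening step would most likely be done by setting the position increment to 0 through PositionIncrementAttribute, but that part is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the parsing rule described above; this is NOT Lucene's API. */
final class QueryShapeSketch {

    /** A term plus its position increment relative to the previous token. */
    static final class Tok {
        final String term;
        final int posInc;

        Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /** Hypothetical "flatten" step: every token after the first gets increment 0. */
    static List<Tok> flatten(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < in.size(); i++) {
            Tok t = in.get(i);
            out.add(new Tok(t.term, i == 0 ? t.posInc : 0));
        }
        return out;
    }

    /** Renders the query shape the rule would pick for one analyzed chunk. */
    static String shape(String field, List<Tok> toks) {
        if (toks.isEmpty()) {
            return "";
        }
        if (toks.size() == 1) {
            return field + ":" + toks.get(0).term;
        }
        // Tokens share one position only if every later increment is 0.
        boolean samePosition = true;
        for (int i = 1; i < toks.size(); i++) {
            if (toks.get(i).posInc != 0) {
                samePosition = false;
            }
        }
        StringBuilder sb = new StringBuilder();
        if (samePosition) {
            // Stacked tokens: a boolean query whose clauses are all SHOULD.
            sb.append("(");
            for (int i = 0; i < toks.size(); i++) {
                if (i > 0) sb.append(" ");
                sb.append(field).append(":").append(toks.get(i).term);
            }
            sb.append(")");
        } else {
            // Simplification: any differing position is rendered as a phrase.
            sb.append("PhraseQuery(").append(field).append(":\"");
            for (int i = 0; i < toks.size(); i++) {
                if (i > 0) sb.append(" ");
                sb.append(toks.get(i).term);
            }
            sb.append("\")");
        }
        return sb.toString();
    }
}
```

Feeding the two tokens "39" and "43" (each with increment 1) to shape for field HouseNumber renders the phrase form from the original question; running the same tokens through flatten first renders the "OR" style form instead. The sketch deliberately ignores the mixed case where only some tokens share a position.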
