Re: Searching for the '+' character

AHMET ARSLAN Mon, 14 Sep 2009 11:42:48 -0700

> Thanks Ahmet,
> 
> That's excellent, thanks :) I may have to increase the
> gramsize to take into account other possible uses but I can
> now read around these filters to make the adjustments.
> 
> With regard to WordDelimiterFilterFactory. Is there a way
> to place a delimiter on this filter to still get most of its
> functionality without it absorbing the + signs?


Yes, you are right: preserveOriginal="1" will cause the original token to be 
indexed without modifications.
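
As a rough sketch of what that could look like in schema.xml (the fieldType 
name "text_wdf" and the surrounding tokenizer/filter choices are only 
assumptions for illustration, not taken from your schema):

```xml
<!-- Sketch only: the fieldType name "text_wdf" is hypothetical. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- preserveOriginal="1" emits the unmodified token (e.g. "C++")
         alongside the subwords the filter generates (e.g. "C"). -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The analysis page in the Solr admin UI is a quick way to check which tokens 
actually come out for your own data.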

> Will I lose a lot of 'good' functionality by removing it?

It depends on your input data. The filter is used to break one token into 
subwords, like: "Wi-Fi" -> "Wi", "Fi" and "PowerShot" -> "Power", "Shot".
If your input data set contains such words, you may need it.

But I think using NGramFilter(s) just to make the last character searchable is 
not an optimal solution. I don't know what type of dataset you have, but I 
think using two separate fields (with different types) is more suitable. One 
field will contain the actual data itself. The other will hold only the last 
character(s).

You can achieve this with a copyField or programmatically during indexing. The 
type of the field lastCharsField will use EdgeNGramFilter so that only the 
last character of each token passes that filter.
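
A minimal sketch of the two-field setup, under some assumptions: the 
fieldType name "lastChars" is made up, originalField is assumed to use your 
existing "text" type, and the side="back" attribute of EdgeNGramFilterFactory 
is assumed to be available in your Solr version (as far as I know, later 
releases dropped it, so please check):

```xml
<!-- Sketch only: the type name "lastChars" is hypothetical. -->
<fieldType name="lastChars" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- side="back" with gram size 1 keeps only the last character
         of each token. -->
    <filter class="solr.EdgeNGramFilterFactory"
            minGramSize="1" maxGramSize="1" side="back"/>
  </analyzer>
</fieldType>

<!-- originalField holds the actual data; its type is assumed here. -->
<field name="originalField" type="text" indexed="true" stored="true"/>
<field name="lastCharsField" type="lastChars" indexed="true" stored="false"/>

<!-- Fill lastCharsField from originalField at index time. -->
<copyField source="originalField" dest="lastCharsField"/>
```

Raise maxGramSize if you want to match more than one trailing character.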

During searching you will search those two fields: 
originalField:\+ OR lastCharsField:\+

The query lastCharsField:\+ will return all the products ending with +.

Hope this helps.
