> Thanks Ahmet,
>
> That's excellent, thanks :) I may have to increase the
> gramsize to take into account other possible uses but I can
> now read around these filters to make the adjustments.
>
> With regard to WordDelimiterFilterFactory: is there a way
> to place a delimiter on this filter to still get most of its
> functionality without it absorbing the + signs?

Yes, you are right: preserveOriginal="1" causes the original token to be indexed without modification.

> Will I lose a lot of 'good' functionality by removing it?

It depends on your input data. WordDelimiterFilterFactory is used to break one token into subwords, for example "Wi-Fi" -> "Wi", "Fi" and "PowerShot" -> "Power", "Shot". If your input data set contains such words, you may need it.

But I think using NGramFilter(s) just to make the last character searchable is not an optimal solution. I don't know what type of data set you have, but I think using two separate fields (with different types) is more suitable. One field will contain the actual data itself. The other will hold only the last character(s). You can achieve this with a copyField or programmatically during indexing. The type of the field lastCharsField will use EdgeNGramFilter so that only the last character of each token passes that filter.

During searching you will search both fields:

originalField:\+ OR lastCharsField:\+

The query lastCharsField:\+ will return all the products ending with +.

Hope this helps.
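
P.S. For reference, the two-field setup described above might look roughly like this in schema.xml. This is only a sketch under assumptions: the type name "lastChars", the existing type "text", and the whitespace tokenizer are placeholders I picked for illustration, and the side="back" attribute of EdgeNGramFilterFactory is only available in older Solr releases, so please check it against the version you run.

```xml
<!-- Hypothetical field type: index only the trailing character of each token.
     side="back" builds the gram from the end of the token; this attribute
     exists in older Solr releases only, so verify it for your version. -->
<fieldType name="lastChars" class="solr.TextField">
  <analyzer type="index">
    <!-- Assumption: whitespace tokenization; use KeywordTokenizerFactory
         instead if only the end of the whole field value matters. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory"
            minGramSize="1" maxGramSize="1" side="back"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gramming at query time: the query term is matched as typed. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Assumption: "text" stands for whatever type originalField already uses. -->
<field name="originalField" type="text" indexed="true" stored="true"/>
<field name="lastCharsField" type="lastChars" indexed="true" stored="false"/>

<!-- Copy the raw value so both fields are filled from one input field. -->
<copyField source="originalField" dest="lastCharsField"/>
```

With something like this in place, a token such as "C+" would be indexed in lastCharsField as just "+", so lastCharsField:\+ matches it even if the analysis chain of originalField strips the plus sign. Raising maxGramSize would keep more than one trailing character if you need that.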