Re: Implications of setting catenateAll=1

2011-11-18 Thread Erick Erickson
The main one is that you can get an explosion in the number of terms, depending on your input, especially if you have things that aren't regular text. Imagine partone-1 partone-2 partone-3 parttwo-1 parttwo-2 parttwo-3. If catenateAll is set to 0, you'd get 5 distinct tokens here. If it was set to 1 you'd

Implications of setting catenateAll=1

2011-11-17 Thread Brendan Grainger
Hi, The default for catenateAll is 0, which we've been using on the WordDelimiterFilter. What would be the possible negative implications of setting this to 1? So that, for example, wi-fi-800 would produce the tokens: wi, fi, wifi, 800, wifi800? Thanks
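
To make the term growth discussed in this thread concrete, here is a rough Python sketch that mimics only a small part of what WordDelimiterFilter does. It is not the Lucene implementation: the function names (delimiter_terms, distinct_terms) and the keyword flags are made up for illustration, it splits only on non-alphanumeric characters (no case-change or letter/digit splitting), and it ignores token positions. It also assumes that the wifi token in the question comes from catenateWords=1 being enabled alongside catenateAll=1.

```python
import re
from itertools import groupby


def delimiter_terms(token, catenate_words=False, catenate_numbers=False,
                    catenate_all=False):
    """Simplified, hypothetical model of WordDelimiterFilter output for one token."""
    # Split on any run of non-alphanumeric characters, e.g. "wi-fi-800" -> wi, fi, 800.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    terms = list(parts)

    # Join runs of adjacent word parts or adjacent number parts when requested.
    for is_number, run in groupby(parts, key=str.isdigit):
        run = list(run)
        wanted = catenate_numbers if is_number else catenate_words
        if wanted and len(run) > 1:
            terms.append("".join(run))

    # catenate_all joins every part of the original token into one extra term.
    if catenate_all and len(parts) > 1:
        terms.append("".join(parts))
    return terms


def distinct_terms(text, **flags):
    """Collect the distinct terms produced for whitespace-separated input."""
    terms = set()
    for token in text.split():
        terms.update(delimiter_terms(token, **flags))
    return terms


text = "partone-1 partone-2 partone-3 parttwo-1 parttwo-2 parttwo-3"

# In this sketch: partone, parttwo, 1, 2, 3
print(len(distinct_terms(text)))                     # 5
# In this sketch: the five above plus partone1 ... parttwo3
print(len(distinct_terms(text, catenate_all=True)))  # 11

print(delimiter_terms("wi-fi-800", catenate_words=True, catenate_all=True))
# ['wi', 'fi', '800', 'wifi', 'wifi800']
```

In this model, every distinct combination of parts adds one more term when catenate_all is on, so input made of identifiers or part numbers grows the term count much faster than ordinary prose does.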