: Here is the string to be indexed without duplication.
:
: Kitchen Cabinet Utah Kitchen Remodeling Utah
:
: Is RemoveDuplicatesTokenFilterFactory for this solution? or for something
: else?
It depends on what you want to do ... you've given us an example of some input, but you haven't elaborated on what result you want.

This is the documentation for RemoveDuplicatesTokenFilter...

    A TokenFilter which filters out Tokens at the same position and Term
    text as the previous token in the stream.

...it only removes duplicates that occur at the same position, so if your goal is to have "Kitchen" and "Utah" indexed only once, then it will only do that if you have a tokenizer (or some other token filter) that flattens out the positionIncrements of all the tokens to 0.

-Hoss
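To make the position rule concrete, here is a minimal sketch in Python. It is a simplified model of the behavior described above, not Lucene's actual code: the function `remove_duplicates` and the (term, position_increment) tuples are hypothetical names made up for illustration. The "flattened" stream assumes every token after the first has a position increment of 0.

```python
def remove_duplicates(tokens):
    """Drop a token if the same term was already seen at the same position.

    tokens: list of (term, position_increment) pairs (a hypothetical model
    of a token stream, not the Lucene API).
    """
    seen = set()  # terms seen at the current position
    out = []
    for term, pos_inc in tokens:
        if pos_inc > 0:
            # The stream moved to a new position, so forget earlier terms.
            seen = set()
        if term in seen:
            continue  # same term at the same position: filtered out
        seen.add(term)
        out.append((term, pos_inc))
    return out


words = "Kitchen Cabinet Utah Kitchen Remodeling Utah".split()

# Typical tokenizer output: every token advances the position by 1.
normal = [(w, 1) for w in words]

# Assumed "flattened" stream: all tokens stacked on the first position.
flattened = [(w, 1 if i == 0 else 0) for i, w in enumerate(words)]

normal_terms = [t for t, _ in remove_duplicates(normal)]
flattened_terms = [t for t, _ in remove_duplicates(flattened)]

print(normal_terms)     # all six tokens survive, duplicates included
print(flattened_terms)  # ['Kitchen', 'Cabinet', 'Utah', 'Remodeling']
```

With ordinary position increments nothing is removed, because the second "Kitchen" and the second "Utah" sit at different positions than the first ones. Only the flattened stream loses them.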