A field-wide remove duplicate tokens filter

Varun Rajput Wed, 17 Dec 2014 14:42:58 -0800

The org.apache.solr.analysis.RemoveDuplicatesTokenFilter, as per its 
description, "Filters out any tokens which are at the same logical position in 
the tokenstream as a previous token with the same text."
A very useful filter would be one that removes duplicate tokens across the 
whole field, irrespective of each token's logical position. Does something 
like this already exist, or is it planned for inclusion in an upcoming 
release?
I have an implementation of this in one of my projects and can contribute it if 
the community finds it useful as well.
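
To make the intended behavior concrete, here is a minimal sketch in plain Java. It only illustrates the deduplication logic on a list of token strings; it is not the implementation mentioned above and does not use the Lucene TokenFilter API. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FieldWideDedupSketch {

    // Hypothetical helper: keeps the first occurrence of each token text
    // and drops every later repeat, regardless of its position in the field.
    public static List<String> removeDuplicates(List<String> tokens) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            // Set.add returns false when the text was already seen.
            if (seen.add(token)) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> field = Arrays.asList("the", "quick", "the", "fox", "quick");
        System.out.println(removeDuplicates(field)); // prints [the, quick, fox]
    }
}
```

An actual filter would presumably keep the same "seen" set inside a TokenFilter, consult it in incrementToken(), and clear it in reset() so that state does not leak between fields; how position increments of dropped tokens should be handled is left open here.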
Best,
Varun
