I am looking through the schema of a Solr installation that I inherited last year. The original developer, who is unavailable for comment, defined two types of text fields: one that uses RemoveDuplicatesTokenFilterFactory and one that does not. Both are intended for full-text search.
Why would someone _not_ use RemoveDuplicatesTokenFilterFactory on a field intended for full-text search? What are the drawbacks of using it?

This application is very write-heavy (hundreds of writes per minute), if that matters. It was running on websolr.com at the time; I have since moved it to Amazon Web Services.

Thanks.

--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com