I'm using a combination of an update processor and a token filter. The
token filter is needed to remove the duplicates that remain after
stemming has occurred.
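
For anyone following along, here is a rough sketch of what the
pre-analysis half could look like. It assumes the update processor's
job is to drop words already seen in earlier values of the same
multivalue field. It is plain Java with no Solr classes, and the class
and method names (MultiValueDedup, dedupeAcrossValues) are made up for
illustration. It only catches exact duplicates, which is why a token
filter is still needed after stemming.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Solr or Lucene.
final class MultiValueDedup {

    // Removes words that already appeared in this or an earlier value.
    // Values that end up empty are dropped from the result.
    static List<String> dedupeAcrossValues(List<String> values) {
        Set<String> seen = new HashSet<String>(); // shared across all values
        List<String> result = new ArrayList<String>();
        for (String value : values) {
            StringBuilder kept = new StringBuilder();
            for (String word : value.trim().split("\\s+")) {
                // Set.add() returns false when the word was seen before.
                if (!word.isEmpty() && seen.add(word)) {
                    if (kept.length() > 0) {
                        kept.append(' ');
                    }
                    kept.append(word);
                }
            }
            if (kept.length() > 0) {
                result.add(kept.toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("red shoes", "blue shoes red", "red");
        System.out.println(dedupeAcrossValues(values)); // prints [red shoes, blue]
    }
}
```

The key point is that the set of seen words lives outside the
per-value loop, which is the state a per-value TokenFilter instance
cannot hold on its own.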

David

2009/6/4 Noble Paul നോബിള്‍  नोब्ळ् <noble.p...@corp.aol.com>:
> Isn't it better to use an UpdateProcessor for this?
>
> On Thu, Jun 4, 2009 at 1:52 AM, Otis Gospodnetic
> <otis_gospodne...@yahoo.com> wrote:
>>
>> Hello,
>>
>> It's ugly, but the first thing that came to mind was ThreadLocal.
>>
>>  Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>> ----- Original Message ----
>>> From: David Giffin <da...@giffin.org>
>>> To: solr-user@lucene.apache.org
>>> Sent: Wednesday, June 3, 2009 1:57:42 PM
>>> Subject: Token filter on multivalue field
>>>
>>> Hi There,
>>>
>>> I'm working on a unique token filter, to eliminate duplicates on a
>>> multivalue field. My filter works properly for a single value field.
>>> It seems that a new TokenFilter is created for each value in the
>>> multivalue field. I need to maintain an array of used tokens across
>>> all of the values in the multivalue field. Is there a good way to do
>>> this? Here is my current code:
>>>
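>>> import java.io.IOException;
>>> import java.util.HashSet;
>>> import java.util.Set;
>>> import org.apache.lucene.analysis.Token;
>>> import org.apache.lucene.analysis.TokenFilter;
>>> import org.apache.lucene.analysis.TokenStream;
>>>
>>> public class UniqueTokenFilter extends TokenFilter {
>>>
>>>     // Terms already emitted by this filter instance.
>>>     private final Set<String> words = new HashSet<String>();
>>>
>>>     public UniqueTokenFilter(TokenStream input) {
>>>         super(input);
>>>     }
>>>
>>>     @Override
>>>     public final Token next(Token in) throws IOException {
>>>         // Pass the reusable token on every call, not only the first.
>>>         for (Token token = input.next(in); token != null; token = input.next(in)) {
>>>             // add() returns false when the term was already seen.
>>>             if (words.add(token.term())) {
>>>                 return token;
>>>             }
>>>         }
>>>         return null;
>>>     }
>>> }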
>>>
>>> Thanks,
>>> David
>>
>>
>
>
>
> --
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com
>
