Hello, I am trying to use a part-of-speech tagger for Bahasa Indonesia (Indonesian) to filter tokens in Solr. The tagger takes the word list of a sentence as input and returns an array of tags.
I think the process should be like this:
- tokenize the text into sentences
- tokenize each sentence into words
- pass the words to the tagger
- set a token attribute from the tagger output
- pass the tokens into a FilteringTokenFilter implementation

Is it possible to do this in Solr/Lucene? If so, how? I've read about a similar solution for Japanese, but since I lack knowledge of Japanese, it didn't help much.

--
Regards,
Rendy Bambang Junior
Informatics Engineering '09
Bandung Institute of Technology
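Setting the Lucene classes aside, the per-token accept/reject decision described in the steps above can be sketched in plain Java. This is only a sketch of the logic, not the actual Lucene `FilteringTokenFilter` API: `PosTagger` here is a hypothetical stand-in for the Indonesian tagger (one tagger call per sentence, one tag per word), and `filterByTag` plays the role that an `accept()` override would play in a real filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the Indonesian POS tagger described above:
// input is the word list of one sentence, output is one tag per word.
interface PosTagger {
    String[] tag(String[] words);
}

class PosFilterSketch {
    // Keep only the tokens whose tag is NOT in the stop-tag set -- the same
    // per-token decision a FilteringTokenFilter's accept() would make,
    // after the tagger has been run once over the whole sentence.
    static List<String> filterByTag(String[] words, PosTagger tagger,
                                    Set<String> stopTags) {
        String[] tags = tagger.tag(words); // one tagger call per sentence
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            if (!stopTags.contains(tags[i])) {
                kept.add(words[i]);
            }
        }
        return kept;
    }
}
```

In a real analysis chain the sentence buffering would happen inside a custom TokenFilter (the filter reads tokens until a sentence boundary, runs the tagger, then emits the tokens with their tag attribute set), and the stop-tag check would live in the `accept()` method of a `FilteringTokenFilter` subclass.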