Hi, I would like to build a component that, during indexing, analyses all tokens in a stream and adds metadata to a new field based on that analysis. I have several tasks in mind, such as basic classification and some more advanced phrase detection. How would I do this? A normal TokenFilter can only look at one token at a time, but I need access to a larger context.
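To make "larger context" concrete, here is a rough plain-Java sketch of the kind of analysis I mean. It deliberately uses no Lucene or Solr classes; the class name `WholeStreamAnalysis`, the method `detectMetadata`, and the capitalized-run rule are all made up for this example and only stand in for my real analysis.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only: no Lucene/Solr classes are used here.
// All names are hypothetical.
class WholeStreamAnalysis {

    // Hypothetical analysis: report every run of two or more capitalized
    // tokens as a candidate phrase. Deciding this requires looking at
    // neighbouring tokens, not just the current one.
    static List<String> detectMetadata(List<String> tokens) {
        List<String> metadata = new ArrayList<>();
        List<String> run = new ArrayList<>();
        for (String token : tokens) {
            if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                run.add(token);          // extend the current capitalized run
            } else {
                flush(run, metadata);    // run ended: emit it if long enough
            }
        }
        flush(run, metadata);            // handle a run that ends the stream
        return metadata;
    }

    // Emits the run as one metadata string if it has at least two tokens,
    // then clears it.
    private static void flush(List<String> run, List<String> metadata) {
        if (run.size() >= 2) {
            metadata.add("phrase:" + String.join(" ", run));
        }
        run.clear();
    }
}
```

The returned strings are what I would then want to end up in a separate field such as "metadata"; how to hook that into the indexing chain is the part I am asking about.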
I've noticed that there is a TeeSinkTokenFilter that might be useful in some way, since "It is also useful for doing things like entity extraction or proper noun analysis", but I don't understand how. Can someone help me with a super-simple stub or similar? What I'm looking for is something like:

    class MySmartFilter {
        public AnalyzeTokens(tokenList) {
            metadataTokens = DoTheAnalysis(tokenList);
            AddToField("metadata", metadataTokens);
        }
    }

Any help is much appreciated!

Thanks
/Bjorn

--
View this message in context: http://lucene.472066.n3.nabble.com/Analysing-all-tokens-in-a-stream-tp2811516p2811516.html
Sent from the Solr - User mailing list archive at Nabble.com.