Hi
I would like to build a component that, during indexing, analyses all tokens
in a stream and adds metadata to a new field based on that analysis. I have
several tasks I would like to perform, such as basic classification and
some more advanced phrase detection. How would I do this? A normal
TokenFilter can only look at one token at a time, but I need access to a
larger context.
I've noticed that there is a TeeSinkTokenFilter that might be useful in
some way, since "It is also useful for doing things like entity extraction or
proper noun analysis", but I don't understand how to use it for this.
Can someone help me with some super-simple stub or similar? What I'm looking
for is something like:
class MySmartFilter {
    public AnalyzeTokens(tokenList)
    {
        metadataTokens = DoTheAnalysis(tokenList);
        AddToField("metadata", metadataTokens);
    }
}
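To make the idea a bit more concrete, here is a minimal plain-Java sketch of the behaviour I am after. It deliberately uses no Lucene or Solr classes: the token stream is modelled as an Iterator<String>, and every name in it (MySmartFilterSketch, bufferAll, doTheAnalysis, the "PHRASE:" prefix) is hypothetical. The analysis is only a placeholder that treats two or more adjacent capitalized tokens as a phrase. I assume a real TokenFilter would have to do the same buffering inside incrementToken(), and writing the result to the "metadata" field is left out because I don't know the right hook for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: not a Lucene TokenFilter, only the buffering idea.
class MySmartFilterSketch {

    // Drain the whole stream into a buffer so the analysis can see every token.
    static List<String> bufferAll(Iterator<String> tokenStream) {
        List<String> buffer = new ArrayList<>();
        while (tokenStream.hasNext()) {
            buffer.add(tokenStream.next());
        }
        return buffer;
    }

    static boolean isCapitalized(String token) {
        return !token.isEmpty() && Character.isUpperCase(token.charAt(0));
    }

    // Placeholder analysis (an assumption, not a real classifier):
    // report each run of two or more adjacent capitalized tokens as a phrase.
    static List<String> doTheAnalysis(List<String> tokens) {
        List<String> metadata = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int j = i;
            while (j < tokens.size() && isCapitalized(tokens.get(j))) {
                j++;
            }
            if (j - i >= 2) {
                metadata.add("PHRASE:" + String.join(" ", tokens.subList(i, j)));
            }
            // Skip past the run, or advance by one if there was no run here.
            i = Math.max(j, i + 1);
        }
        return metadata;
    }

    // Buffer first, then analyse with the full context available.
    static List<String> analyzeTokens(Iterator<String> tokenStream) {
        return doTheAnalysis(bufferAll(tokenStream));
    }

    public static void main(String[] args) {
        List<String> tokens =
            Arrays.asList("I", "flew", "to", "New", "York", "City", "yesterday");
        System.out.println(analyzeTokens(tokens.iterator()));
    }
}
```

The part I cannot work out is where the returned metadata tokens should go so that they end up in a separate field of the same document.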
Any help is much appreciated!
Thanks
/Bjorn