Hi

I would like to build a component that during indexing analyses all tokens
in a stream and adds metadata to a new field based on my analysis. I have
different tasks that I would like to perform, like basic classification and
certain more advanced phrase detections. How would I do this? A normal
TokenFilter can only look at one token at a time, but I need access to a
larger context.

I've noticed that there is a TeeSinkTokenFilter that might be useful in
some way, since its documentation says "It is also useful for doing things
like entity extraction or proper noun analysis", but I don't understand how.

Can someone help me with some super-simple stub or similar? What I'm looking
for is something like:

class MySmartFilter {

  public AnalyzeTokens(tokenList)
  {
    metadataTokens = DoTheAnalysis(tokenList);
    AddToField("metadata", metadataTokens);
  }
}
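
To make the idea more concrete, here is a minimal plain-Java sketch of the
kind of whole-stream analysis I have in mind. It deliberately uses no Lucene
or Solr classes, so it says nothing about how the real hook into the analysis
chain should look. The class name MySmartFilterSketch, the helper flushRun,
the "PHRASE:" prefix, and the rule itself (a run of two or more capitalized
tokens counts as a phrase) are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names are Lucene or Solr APIs.
class MySmartFilterSketch {

    // Looks at the complete token list, which a per-token filter cannot do.
    // Assumed rule: two or more consecutive capitalized tokens form a phrase.
    static List<String> doTheAnalysis(List<String> tokenList) {
        List<String> metadataTokens = new ArrayList<>();
        List<String> run = new ArrayList<>();
        for (String token : tokenList) {
            if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                run.add(token);                 // extend the current run
            } else {
                flushRun(run, metadataTokens);  // run ended, maybe emit it
            }
        }
        flushRun(run, metadataTokens);          // handle a run at end of stream
        return metadataTokens;
    }

    // Emits the run as one metadata token if it has at least two tokens,
    // then clears it so the next run starts empty.
    private static void flushRun(List<String> run, List<String> out) {
        if (run.size() >= 2) {
            out.add("PHRASE:" + String.join(" ", run));
        }
        run.clear();
    }

    public static void main(String[] args) {
        List<String> tokens =
            Arrays.asList("we", "visited", "New", "York", "in", "May");
        // These values are what I would like to end up in the "metadata" field.
        System.out.println(doTheAnalysis(tokens));
    }
}
```

The part I am missing is where to collect the full token list during indexing
and how to get the result into a separate field.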

Any help is much appreciated!
Thanks
/Bjorn

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Analysing-all-tokens-in-a-stream-tp2811516p2811516.html
Sent from the Solr - User mailing list archive at Nabble.com.
