[
https://issues.apache.org/jira/browse/LABS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632754#action_12632754
]
Javier Puerto commented on LABS-118:
------------------------------------
I was thinking along the same lines. We should implement a controller that
iterates over a list of handlers through a common interface. I am unsure,
though, whether to pass the data as a ByteArrayInputStream or through a Writer,
because Tika's output is text (and which encoding would it use?).
The TeeContentHandler is great, but its handlers run in parallel, not in a
chain. It could be useful in the last stage of the process, when the data needs
no further transformation.
The stages could be:
1 Parse [and LinkExtraction?]
2 Handler
3 Action
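
To make the chain idea concrete, here is a minimal sketch of a controller
that feeds each handler's output into the next one. All names here
(HandlerChainSketch, TextHandler, Action, runChain) are hypothetical and come
from neither Tika nor Droids. The sketch assumes the parse stage has already
produced text, and it passes a String between handlers so that the encoding
question only arises at the byte boundaries. That is one possible choice, not
a settled design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical names throughout: none of these types exist in Tika or Droids.
final class HandlerChainSketch {

    /** Common interface for one step of stage 2: takes text, returns transformed text. */
    interface TextHandler {
        String handle(String text);
    }

    /** Stage 3: consumes the final text (for example, to index or store it). */
    interface Action {
        void apply(String text);
    }

    /** Controller: each handler receives the previous handler's output (a chain, not a tee). */
    static String runChain(String parsedText, List<TextHandler> handlers) {
        String current = parsedText;
        for (TextHandler handler : handlers) {
            current = handler.handle(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Stage 1 stand-in: pretend the parser already extracted this text.
        String parsed = "  Hello   Tika  ";

        // Stage 2: handlers applied strictly in order.
        List<TextHandler> handlers = new ArrayList<>();
        handlers.add(String::trim);
        handlers.add(t -> t.replaceAll("\\s+", " "));
        handlers.add(t -> t.toLowerCase(Locale.ROOT));
        String result = runChain(parsed, handlers);

        // Stage 3: no more transformations, so independent actions can
        // all receive the same final text (the tee-like part).
        List<String> sink = new ArrayList<>();
        List<Action> actions = new ArrayList<>();
        actions.add(sink::add);
        actions.add(System.out::println);
        for (Action action : actions) {
            action.apply(result);
        }
    }
}
```

Keeping the chain (stage 2) separate from the fan-out (stage 3) mirrors the
point above: a tee only fits once nothing downstream needs to change the data.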
> Create tied integration with Apache Tika (for parser and handler)
> -----------------------------------------------------------------
>
> Key: LABS-118
> URL: https://issues.apache.org/jira/browse/LABS-118
> Project: Labs
> Issue Type: New Feature
> Components: Droids
> Reporter: Thorsten Scherler
>
> http://incubator.apache.org/tika/
> Apache Tika is a toolkit for detecting and extracting metadata and structured
> text content from various documents using existing parser libraries.