LanguageDetection inside of ExtractingRequestHandler

Martin Ruckli Tue, 19 Jun 2012 08:11:09 -0700

Hi all,

I just wanted to check whether there is demand for this feature. I had to
implement this functionality for one of our customers and would like to
contribute it.


Here is the use case:
We are using the ExtractingRequestHandler with the extractOnly=true flag set.
A request to this handler returns the content of the posted document, as we
want. We would also like to detect the language and return it as a metadata
field in the response from Solr.
As Solr already has support for language detection based on Tika, the only
thing I did was add a new parameter to enable or disable this feature and then
do the language detection in nearly the same way as it is done in the
TikaLanguageIdentifierUpdateProcessor.
I think this would be a nice addition, especially in extractOnly mode.
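
The idea can be sketched roughly as follows. This is only an illustration, not
the actual patch: the parameter name "extractLanguage", the metadata key
"language", and the buildMetadata helper are all hypothetical, and the detector
is a plain function standing in for the Tika-based detection that
TikaLanguageIdentifierUpdateProcessor relies on.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class ExtractOnlyLanguageSketch {

    // Hypothetical request parameter that switches detection on or off.
    static final String PARAM_DETECT_LANGUAGE = "extractLanguage";
    // Hypothetical metadata key under which the detected language is returned.
    static final String FIELD_LANGUAGE = "language";

    /**
     * Returns a copy of the extracted metadata, adding the detected language
     * only when the request parameter is "true" and the detector yields a result.
     * The detector is a stand-in for a Tika-backed language identifier.
     */
    static Map<String, String> buildMetadata(Map<String, String> params,
                                             String extractedText,
                                             Map<String, String> extractedMetadata,
                                             Function<String, Optional<String>> detector) {
        Map<String, String> result = new LinkedHashMap<>(extractedMetadata);
        String flag = params.get(PARAM_DETECT_LANGUAGE);
        boolean enabled = Boolean.parseBoolean(flag); // null or missing means disabled
        if (enabled && extractedText != null && !extractedText.trim().isEmpty()) {
            detector.apply(extractedText)
                    .ifPresent(lang -> result.put(FIELD_LANGUAGE, lang));
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy detector for demonstration only; real detection would be delegated to Tika.
        Function<String, Optional<String>> toyDetector = text ->
                text.contains(" the ") ? Optional.of("en") : Optional.empty();

        Map<String, String> params =
                Collections.singletonMap(PARAM_DETECT_LANGUAGE, "true");
        Map<String, String> metadata =
                Collections.singletonMap("Content-Type", "text/plain");

        System.out.println(buildMetadata(
                params, "This is the extracted body of the document.",
                metadata, toyDetector));
    }
}
```

Keeping the switch off by default would leave existing extractOnly responses
unchanged unless a client explicitly asks for the language.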

What are your thoughts on this?

Cheers
Martin
