Hi,

I use Tika through the Solr ExtractingRequestHandler and I face a very
common use case, namely postprocessing the fields produced by Tika in order
to normalize their values or override them with explicitly passed "literal"
values.
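
To make the use case concrete, here is a rough sketch in plain Java of the
behavior I am after. It does not use any Tika or Solr API: the class
FieldPostProcessor and its methods are hypothetical names made up for
illustration, and the normalization (trim plus lower-casing) is only an
assumed example of what "normalize" could mean.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Tika or Solr.
public class FieldPostProcessor {

    // Merges extracted fields with explicitly passed literal values
    // (literals win on conflict), then normalizes every resulting value.
    public static Map<String, String> postProcess(Map<String, String> extracted,
                                                  Map<String, String> literals) {
        Map<String, String> merged = new LinkedHashMap<>(extracted);
        merged.putAll(literals); // literal values override extracted ones

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : merged.entrySet()) {
            result.put(entry.getKey(), normalize(entry.getValue()));
        }
        return result;
    }

    // Assumed normalization: trim whitespace and lower-case the value.
    static String normalize(String value) {
        return value == null ? null : value.trim().toLowerCase(Locale.ROOT);
    }
}
```

The open question is where such a step is supposed to be plugged in when the
fields come out of the ExtractingRequestHandler.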

With the exception of some vague statements about "ContentHandler", I
failed to find any good examples of this (although it appears to be
quite an important feature).

Does anyone know of some good resources/samples on the proper way to
"postprocess" fields from both Tika results and explicit values?


PS: I initially thought this was up to the Tika API, but I have been
redirected here, as Tika only deals with XML/XPath and fields are in the
scope of the Solr ExtractingRequestHandler only.


Thank you in advance.
