On 6/20/2018 9:05 AM, neotorand wrote:
I have a specific requirement where I need to index the following:

Meta Data of any document
Some parts from the Document that matches some keywords that i configure

The first part I am able to achieve through ERH or FileListEntityProcessor.

I am struggling with the second part. I am looking for an effective and smart
approach to handle this.
Can anyone give me a pointer or help with this?

Write a custom indexing program to compile precisely the information that you need and send that to Solr.

Yes, that is a serious suggestion.  Solr itself is very capable, but it can't do everything that every user's specific business requirements dictate.  A large percentage of Solr users have written custom indexing programs.

It is strongly recommended that the ExtractingRequestHandler never be used in production, because the Tika software it utilizes is prone to serious problems that might extend as far as an actual program crash.  If Tika crashes and it's running inside Solr, then Solr crashes too.  Running Tika in a custom indexing program instead is recommended, so that if it crashes, it's only the indexing program that dies, not Solr.
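
As a rough illustration of what such a custom indexing program could look like, here is a minimal Python sketch. It assumes the document text has already been extracted (for example by Tika running in the indexing program's own process, not inside Solr) and that metadata has been collected separately. The function names, the `snippets` field, and the metadata field names are hypothetical and would need to match your own schema; the update URL is a placeholder for your core's JSON update endpoint.

```python
import json
import re
import urllib.request


def matching_snippets(text, keywords, context=40):
    """Return the text around each case-insensitive keyword hit.

    `context` is the number of characters kept on each side of the match.
    """
    snippets = []
    for kw in keywords:
        for m in re.finditer(re.escape(kw), text, flags=re.IGNORECASE):
            start = max(0, m.start() - context)
            end = min(len(text), m.end() + context)
            snippets.append(text[start:end].strip())
    return snippets


def build_doc(doc_id, metadata, text, keywords):
    """Combine metadata and keyword snippets into one Solr document (a dict).

    "snippets" is a hypothetical multi-valued field; it is an assumption,
    not something defined by Solr itself.
    """
    doc = {"id": doc_id}
    doc.update(metadata)
    doc["snippets"] = matching_snippets(text, keywords)
    return doc


def send_to_solr(docs, update_url):
    """POST a list of documents as JSON to a Solr update endpoint.

    `update_url` is an assumption, e.g. something of the form
    http://localhost:8983/solr/<core>/update?commit=true
    """
    req = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Because the extraction and keyword matching happen in this separate program, a failure while processing a bad document only takes down the indexing run, and Solr keeps serving queries.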

Thanks,
Shawn
