This may help you get started: https://lucidworks.com/2012/02/14/indexing-with-solrj/
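
For the second requirement (indexing only the parts of a document that match configured keywords), here is a rough sketch in plain Java of the filtering step a custom indexing program could run. It is only an illustration: the class name `KeywordSnippets` and the method `extract` are made up for this example, the sentence splitting is deliberately naive, and matching is a simple case-insensitive substring test. It assumes the text has already been extracted by Tika running inside the indexing program rather than inside Solr; building the document and sending it with SolrJ is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Solr, SolrJ, or Tika.
class KeywordSnippets {

    // Returns the sentences of `text` that contain at least one keyword
    // (case-insensitive substring match).
    static List<String> extract(String text, List<String> keywords) {
        List<String> hits = new ArrayList<>();
        // Naive sentence split: break at whitespace that follows '.', '!' or '?'.
        for (String sentence : text.split("(?<=[.!?])\\s+")) {
            String lower = sentence.toLowerCase(Locale.ROOT);
            for (String keyword : keywords) {
                if (lower.contains(keyword.toLowerCase(Locale.ROOT))) {
                    hits.add(sentence.trim());
                    break; // one match is enough; do not add the sentence twice
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: in a real indexer this text would come from Tika running
        // in this process; it is hard-coded here to keep the sketch runnable.
        String text = "Invoice 42 is overdue. The weather was nice. "
                + "Payment terms are 30 days.";
        List<String> snippets = extract(text, List.of("invoice", "payment"));
        // Each snippet could become one value of a multi-valued field on the
        // document sent to Solr, next to the metadata fields.
        snippets.forEach(System.out::println);
    }
}
```

Keeping this step in your own program means the keyword list and the matching rule (sentence, paragraph, or a window around the hit) stay under your control and can change without touching the Solr configuration.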

Best,
Erick

On Thu, Jun 21, 2018 at 8:11 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 6/20/2018 9:05 AM, neotorand wrote:
>>
>> I have a specific requirement where I need to index the following:
>>
>> Metadata of any document
>> Some parts of the document that match keywords that I configure
>>
>> The first part I am able to achieve through ERH or
>> FilelistEntityProcessor.
>>
>> I am struggling with the second part. I am looking for an effective and
>> smart approach to handle this.
>> Can anyone give me a pointer or help with this?
>
>
> Write a custom indexing program to compile precisely the information that
> you need and send that to Solr.
>
> Yes, that is a serious suggestion. Solr itself is very capable, but it
> can't do everything that every user's specific business requirements
> dictate. A large percentage of Solr users have written custom indexing
> programs.
>
> It is strongly recommended that the ExtractingRequestHandler never be used
> in production, because the Tika software it utilizes is prone to serious
> problems that might extend as far as an actual program crash. If Tika
> crashes and it's running inside Solr, then Solr crashes too. Running Tika
> in a custom indexing program instead is recommended, so that if it crashes,
> it's only the indexing program that dies, not Solr.
>
> Thanks,
> Shawn