Hi,

What are the pros and cons of these two approaches?

1. Use Nutch to crawl the file system, parse the files, perform other data manipulation, and eventually index to Solr.
2. Use Solr DataImportHandlers and plugins to perform the same task.

Note that I have tens of millions of documents to handle in the initial import, followed by delta imports of around 100k documents per day. Each document may be up to 100 MB.
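
To put the daily delta figures in perspective, here is a rough back-of-envelope sketch in Python. It assumes the worst case, where every document hits the 100 MB ceiling, and decimal units (1 MB = 10^6 bytes). Both are assumptions for illustration only; real average sizes are presumably much lower. The function name `daily_load` is hypothetical.

```python
# Back-of-envelope load estimate for the daily delta import.
DOCS_PER_DAY = 100_000           # daily delta volume stated in the post
MAX_DOC_BYTES = 100 * 10**6      # 100 MB ceiling, decimal units (assumption)
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load(docs_per_day, max_doc_bytes):
    """Return (docs/sec, worst-case bytes/day, worst-case bytes/sec)."""
    docs_per_sec = docs_per_day / SECONDS_PER_DAY
    bytes_per_day = docs_per_day * max_doc_bytes
    bytes_per_sec = bytes_per_day / SECONDS_PER_DAY
    return docs_per_sec, bytes_per_day, bytes_per_sec


docs_per_sec, bytes_per_day, bytes_per_sec = daily_load(DOCS_PER_DAY, MAX_DOC_BYTES)
print(f"{docs_per_sec:.2f} docs/s sustained")
print(f"{bytes_per_day / 10**12:.1f} TB/day worst case")
print(f"{bytes_per_sec / 10**6:.1f} MB/s worst case")
```

So whichever approach is chosen would need to sustain a little over one document per second for the deltas, and in the worst case on the order of 10 TB of raw input per day, before the initial import is even considered.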