Thank you, Shawn. But if I use the Solarium PHP client in production, what should I do in that case?

2016-03-25 13:44 GMT+00:00 Shawn Heisey <apa...@elyograg.org>:

> On 3/25/2016 5:44 AM, Moncif Aidi wrote:
> > Im Using solr 5.4.1 for indexing thousands of documents, and it works
> > perfectly.The issue comes when some documents are not well formatted or
> > contains some special characters and it makes solr hangs or blocked on some
> > perticular documents and it gives these errors when viewing the log :
> > i want to detect what files are causing these problems, or at least point
> > me to some library Im missing. Thanks in advance
>
> Tika is known for problems like this, particularly with PDF and
> Microsoft Office documents.
>
> This is one of the hazards of running with the Tika application built
> into Solr's Extracting Request Handler. You can't get any good
> information out of Solr about what went wrong, and any severe problems
> with Tika might actually cause Solr to completely crash.
>
> If you're going to use Tika for production indexing, you should write a
> Java program using SolrJ and Tika so that you are in complete control,
> and so Solr isn't unstable.
>
> Thanks,
> Shawn
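
For reference, the idea in the quoted advice (run extraction outside Solr, one document at a time, so that a bad file is identified instead of hanging the server) could be sketched roughly as below. This is only a sketch using the Java standard library: `GuardedExtraction`, `Extractor`, and `findProblemFiles` are hypothetical names, `Extractor` is a stand-in for whatever call actually runs Tika, and the indexing step is left as a comment because it depends on the client that sends the text to Solr.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class GuardedExtraction {

    /** Hypothetical stand-in for the real text extraction call (not a Tika API). */
    interface Extractor {
        String extract(String path) throws Exception;
    }

    /**
     * Extracts each file on its own worker thread with a time limit.
     * Returns a map from file path to the reason it failed or timed out.
     */
    static Map<String, String> findProblemFiles(
            List<String> paths, Extractor extractor, long timeoutMillis) {
        Map<String, String> problems = new LinkedHashMap<>();
        for (String path : paths) {
            // Daemon thread, so a parser that never returns cannot keep the JVM alive.
            ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            });
            Callable<String> task = () -> extractor.extract(path);
            Future<String> future = pool.submit(task);
            try {
                String text = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                // Assumption: at this point `text` would be sent to Solr as a
                // normal document by the indexing client of your choice.
            } catch (TimeoutException e) {
                future.cancel(true); // best effort; a stuck parser may ignore interrupts
                problems.put(path, "timed out after " + timeoutMillis + " ms");
            } catch (ExecutionException e) {
                problems.put(path, "failed: " + e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                problems.put(path, "interrupted");
                break;
            } finally {
                pool.shutdownNow();
            }
        }
        return problems;
    }
}
```

A thread timeout only abandons a stuck parse; it does not reclaim memory or stop a parser that ignores interrupts. Running the extraction in a separate process would isolate it more strongly, at the cost of more plumbing.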