> There are some zip files inside the directory and have been addressed
> to in the database. I'm thinking those are the one's it's jumping
> right over.

With SOLR-7189, which should have kicked in for 5.1, Tika shouldn't skip over Zip files; it should process all the contents of those zips and concatenate the extracted text into one string.

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Tuesday, July 21, 2015 10:41 AM
To: solr-user@lucene.apache.org
Subject: Re: Data Import Handler Stays Idle

On 7/21/2015 8:17 AM, Paden wrote:
> There are some zip files inside the directory and have been addressed
> to in the database. I'm thinking those are the one's it's jumping
> right over. They are not the issue. At least I'm 95% sure. And Shawn
> if you're still watching I'm sorry I'm using solr-5.1.0.

Have you started Solr with a larger heap than the default 512MB in Solr 5.x? Tika can require a lot of memory. I would have expected there to be OutOfMemoryError exceptions in the log if that were the problem, though.

You may need to use the "-m" option on the startup scripts to increase the max heap. Starting with "-m 2g" would be a good idea.

Also, seeing the entire multi-line IOException from the log (which may be dozens of lines) could be important.

Thanks,
Shawn
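
For anyone unsure what "process all the contents of those zips and concatenate the extracted text into one string" means in practice, here is a minimal Python sketch of the idea. It is not Tika's or Solr's code: the names concat_zip_text and make_sample_zip are hypothetical, and it assumes every entry is plain UTF-8 text, whereas Tika would detect each entry's type and pick a parser for it.

```python
import io
import zipfile


def concat_zip_text(zip_bytes: bytes, sep: str = "\n") -> str:
    """Read every file entry in a zip and join the decoded text into one string."""
    parts = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            raw = archive.read(info)
            # Assumption: entries are plain UTF-8 text. A real extractor
            # such as Tika would choose a parser per entry type instead.
            parts.append(raw.decode("utf-8", errors="replace"))
    return sep.join(parts)


def make_sample_zip() -> bytes:
    """Build a small in-memory zip with two text entries for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("a.txt", "first document")
        archive.writestr("b.txt", "second document")
    return buf.getvalue()


if __name__ == "__main__":
    print(concat_zip_text(make_sample_zip()))
```

The point of the sketch is only that a single zip ends up as one combined text value, so an indexed zip would show up as one document rather than being skipped.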