> There are some zip files inside the directory, and they are referenced
> in the database. I'm thinking those are the ones it's jumping
> right over.

With SOLR-7189, which should have kicked in for 5.1, Tika shouldn't skip over 
zip files; it should process all the contents of those zips and concatenate the 
extracted text into one string.
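
To make "process all the contents and concatenate" concrete, here is a rough 
sketch in Python using only the standard library. It is not Tika's code and 
doesn't go through Solr at all; the function names are made up for 
illustration, and it assumes every entry is either plain UTF-8 text or another 
zip.

```python
import io
import zipfile


def concat_zip_text(zip_bytes: bytes) -> str:
    """Return the text of every entry in a zip archive, joined into one string.

    Hypothetical helper: entries that are themselves zips are unpacked
    recursively; everything else is decoded as UTF-8 text.
    """
    parts = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no text
            data = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(data)):
                # Nested archive: recurse and keep its text in place.
                parts.append(concat_zip_text(data))
            else:
                parts.append(data.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def make_demo_zip() -> bytes:
    """Build a small in-memory zip with two text entries (demo data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("a.txt", "first document")
        archive.writestr("b.txt", "second document")
    return buf.getvalue()


if __name__ == "__main__":
    print(concat_zip_text(make_demo_zip()))
```

The point is only that one zip in the directory should end up as one string of 
extracted text, not as a skipped file.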


-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Tuesday, July 21, 2015 10:41 AM
To: solr-user@lucene.apache.org
Subject: Re: Data Import Handler Stays Idle

On 7/21/2015 8:17 AM, Paden wrote:
> There are some zip files inside the directory, and they are referenced
> in the database. I'm thinking those are the ones it's jumping
> right over. They are not the issue. At least I'm 95% sure. And Shawn,
> if you're still watching: sorry, I'm using solr-5.1.0.

Have you started Solr with a larger heap than the default 512MB in Solr 5.x?  
Tika can require a lot of memory.  I would have expected there to be 
OutOfMemoryError exceptions in the log if that were the problem, though.

You may need to use the "-m" option on the startup scripts to increase the max 
heap.  Starting with "-m 2g" would be a good idea.

Also, seeing the entire multi-line IOException from the log (which may be 
dozens of lines) could be important.

Thanks,
Shawn
