: It wasn't just a single file, it was dozens of files all having problems 
: toward the end just before I killed the process.
        ...
: That is by no means all the errors, that is just a sample of a few.  
: You can see they all threw HTTP 500 errors.  What is strange is, nearly 
: every file succeeded before about the 2200-files-mark, and nearly every 
: file after that failed.

The root question is: do those files *only* fail if you have already 
indexed ~2200 files, or do they also fail if you start up your server and 
index them first?

There may be a resource issue (if it only happens after indexing ~2200 
files), or it may just be a problem with a large number of your PDFs that 
your iteration code happens to reach at that point.

If it's the former, then there may be something buggy about how Solr is 
using Tika to cause the problem -- if it's the latter, then it's a straight 
Tika parsing issue.
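
One way to tell the two cases apart is to restart the server and post only the files that failed, before anything else. A rough sketch of that check follows. `post_file` is a hypothetical callable standing in for whatever your iteration code already uses to send one file to Solr and return the HTTP status; nothing here is a Solr API, and the diagnosis strings are only a reading of the two cases above.

```python
from typing import Callable, Iterable, List, Tuple


def split_failures(
    failed_paths: Iterable[str],
    post_file: Callable[[str], int],
) -> Tuple[List[str], List[str]]:
    """Re-post previously failed files against a freshly started server.

    post_file is an assumed helper: it takes a path and returns the
    HTTP status code of the extraction request.
    """
    still_failing: List[str] = []
    now_ok: List[str] = []
    for path in failed_paths:
        status = post_file(path)
        if status == 200:
            now_ok.append(path)
        else:
            still_failing.append(path)
    return still_failing, now_ok


def diagnose(still_failing: List[str], now_ok: List[str]) -> str:
    """Map the re-run result onto the two cases described above."""
    if still_failing and not now_ok:
        # The files fail even on a fresh server: points at parsing.
        return "file-specific"
    if now_ok and not still_failing:
        # The files succeed when indexed first: points at accumulated state.
        return "state-dependent"
    if not still_failing and not now_ok:
        return "no-data"
    return "mixed"
```

If every file still fails on a fresh server, the files themselves are the likely problem; if they all succeed, something is accumulating on the server side over the course of the run.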

: > now, commit is set to false to speed up the indexing, and I'm assuming that
: > Solr should be auto-committing as necessary.  I'm using the default
: > solrconfig.xml file included in apache-solr-1.4.1\example\solr\conf.  Once

Solr does no autocommitting by default; you need to check your 
solrconfig.xml.
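
A quick way to check is to look for an active (not commented-out) autoCommit element under updateHandler in solrconfig.xml. The sketch below assumes that layout and the element names maxDocs and maxTime; the function name is made up for illustration. Python's standard XML parser drops comments, so a commented-out block is reported as absent.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def autocommit_settings(solrconfig_xml: str) -> Optional[Dict[str, str]]:
    """Return the active autoCommit settings, or None if there are none.

    Assumes the usual layout: <config><updateHandler><autoCommit>...
    """
    root = ET.fromstring(solrconfig_xml)
    node = root.find("updateHandler/autoCommit")
    if node is None:
        # Missing or commented out: no autocommit will happen.
        return None
    return {child.tag: (child.text or "").strip() for child in node}
```

A result of None means nothing is committed until your client sends an explicit commit.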


-Hoss
