I'm using a combination of Tika and custom code (via SolrJ) to extract text from files. I was looking at the number of files I had in my index and noticed many of them were missing. Then I went to the Solr admin panel and saw this in the log files:
SEVERE SolrCore java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
SEVERE SolrDispatchFilter null:java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
SEVERE CommitTracker auto commit error...:java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit

After this, all uploads through Tika fail with an internal server error (HTTP 500). This is the code I use to upload files with Tika:

    import java.io.File;
    import java.io.IOException;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    SolrServer solr;
    ...

    public void IndexFile(File fileToIndex) throws IOException, SolrServerException {
        // Send the raw file to the extracting request handler (Solr Cell / Tika)
        ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
        up.addFile(fileToIndex, "application/octet-stream");
        up.setParam("literal.filename", fileToIndex.getName());
        // Commit right away, waiting for flush and for a new searcher
        up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        solr.request(up);
    }

Is there a way to skip the file that caused the OutOfMemoryError and then *continue extracting/indexing*? I don't know how to do this in SolrJ; a sketch of what I have in mind is at the end of this mail. All the files I uploaded manually kept working, because I index each page of a PDF separately using PDFBox (also sketched below). Only the files that went through Tika threw exceptions and didn't commit. I know I could increase the memory settings, but some Excel files fail to extract even with 16 GB of memory assigned; I've tested this with the Tika library directly.
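What I have in mind is roughly the sketch below: catch the failure per file and move on to the next one. This is only an illustration, sitting in the same class as IndexFile above; indexAllFiles and the error logging are made up, not real SolrJ API. And the catch alone doesn't actually solve it, because once the writer has hit the OutOfMemoryError every subsequent request fails with the same IllegalStateException:

    import java.util.List;

    // Hypothetical driver loop in the same class as IndexFile above
    public void indexAllFiles(List<File> files) {
        for (File f : files) {
            try {
                IndexFile(f);
            } catch (IOException | SolrServerException e) {
                // Skip the file that blew up and keep going with the rest
                System.err.println("Skipping " + f.getName() + ": " + e.getMessage());
            }
        }
    }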
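For reference, the manual PDF path extracts one page at a time with PDFBox, roughly like this (a simplified sketch, not my exact code; the method name and the Solr field names are illustrative):

    import java.io.File;
    import java.io.IOException;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.util.PDFTextStripper;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.common.SolrInputDocument;

    // In the same class, reusing the solr field from above
    public void indexPdfPerPage(File pdf) throws IOException, SolrServerException {
        PDDocument doc = PDDocument.load(pdf);
        try {
            PDFTextStripper stripper = new PDFTextStripper();
            for (int page = 1; page <= doc.getNumberOfPages(); page++) {
                // Extract one page at a time instead of the whole document at once
                stripper.setStartPage(page);
                stripper.setEndPage(page);
                String text = stripper.getText(doc);

                SolrInputDocument sdoc = new SolrInputDocument();
                sdoc.addField("id", pdf.getName() + "_" + page); // illustrative field names
                sdoc.addField("filename", pdf.getName());
                sdoc.addField("page", page);
                sdoc.addField("content", text);
                solr.add(sdoc);
            }
            solr.commit();
        } finally {
            doc.close();
        }
    }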