Antelmo Aguilar <antelmo.aguilar...@nd.edu> wrote:
> I am trying to index a very large file in Solr (around 5GB).  However, I
> get out of memory errors using Curl.  I tried using the post script and I
> had some success with it.  After indexing several hundred thousand records
> though, I got the following error message:

This indicates that your file contains a lot of documents. The solution is to 
split it into smaller files and send more of them. Maybe a few hundred MB each, 
to keep things manageable?

> *SimplePostTool: FATAL: IOException while posting data:
> java.io.IOException: too many bytes written*

A look at the postData method in SimplePostTool (at least for Solr 4.10, which 
is what my editor had open) reveals that it casts the length of the file to an 
int, which overflows when the file is larger than 2GB. This means the 
HttpURLConnection that is used for posting gets the wrong expected size and 
throws the exception when that size is exceeded.
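The overflow can be illustrated with a small sketch (the 5GB figure is taken from the original report; the narrowing cast mirrors the one in postData, though the variable names here are my own):

```java
public class CastOverflowDemo {
    public static void main(String[] args) {
        long fileSize = 5L * 1024 * 1024 * 1024; // a 5GB file, as in the report
        // Narrowing cast, like the one in SimplePostTool.postData:
        int truncated = (int) fileSize;
        // The top 32 bits are dropped: 5368709120 becomes 1073741824 (1GB),
        // so the connection expects far fewer bytes than are actually written.
        System.out.println(fileSize + " -> " + truncated);
    }
}
```

Once the streamed bytes exceed the truncated expected length, HttpURLConnection's fixed-length output stream raises the "too many bytes written" IOException seen above.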

A real fix (if it is not already in Solr 5) would be to fail fast if the file 
is larger than Integer.MAX_VALUE.
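A minimal sketch of such a guard (the method name and message are mine, not actual Solr code):

```java
public class SizeGuard {
    // Hypothetical fail-fast check; Solr's real fix may look different.
    static void checkPostable(String name, long sizeBytes) {
        if (sizeBytes > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("File " + name + " is "
                + sizeBytes + " bytes; files over 2GB cannot be posted in one request");
        }
    }

    public static void main(String[] args) {
        checkPostable("small.xml", 1024);                    // fine
        checkPostable("huge.xml", 5L * 1024 * 1024 * 1024);  // throws
    }
}
```

Failing with a clear message before any bytes are sent is much friendlier than dying mid-upload with an opaque IOException.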

- Toke Eskildsen
