Antelmo Aguilar <antelmo.aguilar...@nd.edu> wrote:
> I am trying to index a very large file in Solr (around 5GB). However, I
> get out of memory errors using Curl. I tried using the post script and I
> had some success with it. After indexing several hundred thousand records
> though, I got the following error message:
This indicates that your file contains a lot of documents. The solution is
to create smaller files and send more of them. Maybe a few hundred MB per
file, to keep it manageable?

> *SimplePostTool: FATAL: IOException while posting data:
> java.io.IOException: too many bytes written*

A look at the postData method in SimplePostTool (at least for Solr 4.10,
which is what my editor had open) reveals that it takes the length of the
file as an Integer, which overflows when the file is larger than 2GB. This
means the HttpURLConnection that is used for posting is given the wrong
expected size and throws the exception above once that size is exceeded.

A real fix (if it is not already in Solr 5) would be to fail fast if the
file is larger than Integer.MAX_VALUE.

- Toke Eskildsen
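
(A minimal, self-contained sketch of the overflow described above and of the
suggested fail-fast check. The class name, the failFastOnOversizedFile helper
and the numbers are illustrative only, not the actual SimplePostTool code.)

public class PostSizeOverflowSketch {

    // Fail-fast check along the lines suggested above: reject files whose
    // length cannot be represented as an int content length.
    static void failFastOnOversizedFile(long length) {
        if (length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "File of " + length + " bytes exceeds Integer.MAX_VALUE ("
                + Integer.MAX_VALUE + "); split it into smaller files");
        }
    }

    public static void main(String[] args) {
        long length = 5L * 1024 * 1024 * 1024; // a ~5GB file

        // Casting the long length to int overflows: a 5GB length becomes
        // roughly 1GB, so a connection told to expect that many bytes fails
        // with "too many bytes written" once more bytes arrive.
        System.out.println("real length:      " + length);
        System.out.println("after (int) cast: " + (int) length);

        failFastOnOversizedFile(length); // throws for this 5GB example
    }
}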