Re: Help Indexing Large File

2015-12-14 Thread Jack Krupansky
What is the nature of the file? Is it Solr XML, CSV, PDF (via Solr Cell), or... what? If it is a PDF, maybe it has lots of high-resolution images. If so, you may need to strip out the images and just send the text, which would be a lot smaller. For example, you could run Tika locally to extract the text an

Re: Help Indexing Large File

2015-12-14 Thread Toke Eskildsen
Antelmo Aguilar wrote: > I am trying to index a very large file in Solr (around 5GB). However, I > get out of memory errors using Curl. I tried using the post script and I > had some success with it. After indexing several hundred thousand records > though, I got the following error message: Th

Re: Help Indexing Large File

2015-12-14 Thread Erick Erickson
Well, this usually means the maximum packet size has been exceeded. There are several possibilities here that I'm going to skip over, because I first have to ask what the purpose of indexing a 5G file is. Indexing such a huge file has several problems from a user's perspective: 1> assuming the bulk of it is text

Help Indexing Large File

2015-12-14 Thread Antelmo Aguilar
Hello, I am trying to index a very large file in Solr (around 5GB). However, I get out of memory errors using Curl. I tried using the post script and I had some success with it. After indexing several hundred thousand records though, I got the following error message: *SimplePostTool: FATAL: I
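One way to avoid sending a 5GB file in a single request is to stream it and post it in smaller pieces. The sketch below is not from the thread. It assumes the file is a CSV with a header row (the thread does not say what format it is), and the name batched_csv and the batch size are hypothetical. It only shows the splitting step and reads one batch into memory at a time.

```python
import csv
import io
from typing import Iterable, Iterator, List


def _render(header: List[str], rows: List[List[str]]) -> str:
    """Serialize a header row plus data rows back into CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def batched_csv(lines: Iterable[str], batch_size: int) -> Iterator[str]:
    """Yield CSV payloads holding the header plus at most batch_size records.

    Hypothetical helper: `lines` can be an open file object, so the input
    is consumed lazily and only one batch is held in memory at a time.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to yield
    batch: List[List[str]] = []
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            yield _render(header, batch)
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch.
        yield _render(header, batch)
```

Each yielded payload could then be sent to Solr as its own, much smaller, request, for example by iterating over batched_csv(open(path, newline=""), 10000), where path and the batch size of 10000 are placeholders to tune for the setup.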