I've discovered that some documents are over 100 MB in size. Could this be the problem?

On Thu, Apr 19, 2012 at 3:49 PM, Bram Rongen <m...@bramrongen.nl> wrote:
> Hello Shawn,
>
> Thanks very much for your answer.
>
> Yesterday I've started indexing again but this time on Solr 3.6.. Again
> Solr is failing around the same time, but not exactly (now the largest fdt
> file is 4.8G).. It's right after the moment I receive memory-errors at the
> Drupal side which make me suspicious that it maybe has something to do with
> a huge document.. Is that possible? I was indexing 1500 documents at once
> every minute. Drupal builds them all up in memory before submitting them to
> Solr. At some point it runs out of memory and I have to switch to 10/20
> documents per minute for a while.. then I can switch back to 1000 documents
> per minute.
>
> The disk is a software RAID1 over 2 disks. But I've also run into the same
> problem at another server.. This was a VM-server with only 1GB ram and 40GB
> of disk. With this server the merge-repeat happened at an earlier stage.
>
> I've also let Solr continue with merging for about two days before (in an
> earlier attempt), without submitting new documents. The merging kept
> repeating.
>
> Somebody suggested it could be because I'm using Jetty, could that be
> right?
>
> My schema.xml and solrconfig.xml can be found here:
> http://pastebin.com/GeBrB903 http://pastebin.com/Su8q1WAh
>
> Kind regards,
> Bram Rongen
>
>
> On Wed, Apr 18, 2012 at 10:54 PM, Shawn Heisey <s...@elyograg.org> wrote:
>
>> On 4/18/2012 6:17 AM, Bram Rongen wrote:
>>
>>> I've been using Solr for a very short time now and I'm stuck. I'm trying to
>>> index a drupal website consisting of 1.2 million smaller nodes and 300k
>>> larger nodes (~400kb avg)..
>>>
>>
>> A followup to my previous reply: Your ramBufferSizeMB is only 32, the
>> default in the example config. I have seen recommendations indicating that
>> going beyond 128MB is not usually helpful. With such large input
>> documents, that may not apply to you - try setting it to 512 or 1024. That
>> will result in far fewer index segments being created. They will be
>> larger, so merges will be much less frequent but take longer.
>>
>> Thanks,
>> Shawn
>>
>>
>
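
The thread describes batches of a fixed document count (1500 per submission) being built up in memory, with memory errors appearing around the time a very large document comes through. One possible way to soften that is to cap each batch by total size as well as by count, so a 100+MB document ends up in a batch of its own. What follows is only a rough sketch of that idea in Python, not the Drupal module's actual code: the function name `batch_by_size` and both limits are made up for illustration, and documents are assumed to be plain strings already serialized for submission.

```python
from typing import Iterable, Iterator, List


def batch_by_size(
    docs: Iterable[str], max_bytes: int, max_docs: int
) -> Iterator[List[str]]:
    """Group documents into batches limited by total size and by count.

    Hypothetical helper: a document that is larger than max_bytes on its
    own is still emitted, but as a single-document batch.
    """
    batch: List[str] = []
    batch_bytes = 0
    for doc in docs:
        size = len(doc.encode("utf-8"))
        # Flush the pending batch if this document would push it over
        # the byte limit, or if the batch already holds max_docs items.
        if batch and (batch_bytes + size > max_bytes or len(batch) >= max_docs):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch


if __name__ == "__main__":
    # Small stand-ins for real documents; the third one is "huge"
    # relative to the 25-byte limit used here.
    sample = ["a" * 10, "b" * 10, "c" * 100, "d" * 5]
    for group in batch_by_size(sample, max_bytes=25, max_docs=1500):
        print(len(group), sum(len(d) for d in group))
```

With real data the byte limit would be chosen relative to the memory available on the submitting side. Whether this has any effect on the repeated merging on the Solr side is a separate question that the sketch does not address.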