Hello,
I am receiving OutOfMemoryError during indexing, and after investigating the 
heap dump, I am still missing some information, and I thought this might be a 
good place for help.

I am using Solr 4.0 beta, and I have 5 threads that send update requests to 
Solr. Each request is a bulk of 100 SolrInputDocuments (using solrj), and my 
goal is to index around 2.5 million documents.
Solr is configured to do a hard-commit every 10 seconds, so initially I thought 
that it could accumulate at most 10 seconds' worth of updates in memory, but 
that's not the case. I can see in a profiler how memory usage grows over time, 
even with 4 to 6 GB of memory. It is also configured to optimize with 
mergeFactor=10.

At first I thought that optimization is a blocking, synchronous operation. It 
is, in the sense that the index can't be updated during optimization. However, 
it is not synchronous, in the sense that the update request coming from my code 
is not blocked - Solr just returns an OK response, even while the index is 
optimizing.
This suggests that Solr has an internal queue of inbound requests, and that 
the OK response just means that the request is in the queue. I got confirmation 
of this from a friend who is a Solr expert (or so I hope).

My main question is: how can I put a bound on this internal queue, and make 
update requests block when the queue is full? Put another way, I need to know 
whether Solr is really ready to receive more requests, so that I don't 
overload it and cause an OOME.
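
To make the question concrete, here is a minimal sketch of the kind of 
back-pressure I mean, done on the client side only. The class name 
BoundedBatchSender and the "sender" callback are hypothetical; in my case the 
callback would wrap the solrj add call for one bulk of documents. It only 
limits how many bulks my code has outstanding at once. It cannot see whatever 
Solr has queued internally after returning OK, which is the part I am asking 
about.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper: caps the number of batches that are in flight at once.
final class BoundedBatchSender<T> {
    private final int maxInFlight;
    private final Semaphore permits;
    private final ExecutorService pool;
    private final Consumer<List<T>> sender; // assumed to send one batch and return when done

    BoundedBatchSender(int maxInFlight, Consumer<List<T>> sender) {
        this.maxInFlight = maxInFlight;
        this.permits = new Semaphore(maxInFlight);
        this.pool = Executors.newFixedThreadPool(maxInFlight);
        this.sender = sender;
    }

    // Blocks the caller while maxInFlight batches are already pending.
    void submit(List<T> batch) throws InterruptedException {
        permits.acquire();
        try {
            pool.execute(() -> {
                try {
                    sender.accept(batch);
                } finally {
                    permits.release(); // free the slot even if sending failed
                }
            });
        } catch (RuntimeException e) {
            permits.release(); // the task was never scheduled
            throw e;
        }
    }

    // Waits until every pending batch has finished, then stops the workers.
    void close() throws InterruptedException {
        permits.acquire(maxInFlight);
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

With maxInFlight = 5 this behaves like my current 5 sender threads, except 
that the producer blocks as soon as 5 bulks are pending. What I am missing is 
an equivalent signal from Solr itself.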

I performed several tests, with slow and fast disks, and on the really fast 
disks the problem didn't occur. However, I can't demand such fast disks from 
all the clients, and even with a fast disk the problem will occur eventually 
when I try to index 10 million documents.
I also tried indexing with optimization disabled, but it didn't help.

Thanks,
Yoni
