On 9/18/2013 3:40 AM, kowish.adamosh wrote:
> I have a problem with data import (based on database sql) in Solr Cloud. I'm
> trying to import ~500 000 000 of documents and I've created 30 logical
> shards on 2 physical machines. Documents are distributed by composite id.
> After some time (5-10 minutes; about 400 000 documents) Solr Cloud stops
> indexing documents. This is because indexing thread parks and waits on
> semaphore:
> org.apache.solr.update.SolrCmdDistributor#semaphore.acquire() in method
> submit.

There are some SolrCloud bugs that we expect will be fixed in version 4.5. Basically what happens is that when a large number of updates are being distributed from whichever core receives them to the appropriate shard replicas, managing all those requests results in a deadlock.

If everything goes well with the release, 4.5 will be out sometime within the next two weeks. You can always download and build the "branches/lucene_solr_4_5" code branch from SVN if you want to try out what will become Solr 4.5:

http://wiki.apache.org/solr/HowToContribute#Getting_the_source_code

SOLR-4816 is semi-related, because it helps avoid the problem in the first place when using CloudSolrServer in a Java program. I'm having a hard time finding the Jira issue number(s) for the underlying problem(s), but I know some changes were committed recently specifically for this problem.

Thanks,
Shawn
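
For anyone who wants to see the reported symptom in isolation (a thread stuck waiting on a semaphore inside a submit method), here is a minimal Java sketch of the general pattern. It is not Solr's SolrCmdDistributor code: the class BoundedSubmitSketch and its methods submit and complete are hypothetical names made up for illustration, and a timed tryAcquire is used in place of a plain acquire() so the example returns instead of hanging.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is NOT Solr's SolrCmdDistributor.
class BoundedSubmitSketch {
    // One permit per request that may be in flight at the same time.
    private final Semaphore permits;

    BoundedSubmitSketch(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Tries to reserve a slot for one request. Returns false if no permit
    // became free within the timeout. With a plain acquire() the calling
    // thread would park here indefinitely instead.
    boolean submit(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    // Must be called when a request finishes. If completions never happen,
    // every later submit() finds zero permits available.
    void complete() {
        permits.release();
    }
}
```

With two permits, two calls to submit succeed immediately and a third one waits until complete is called. If whatever is responsible for calling complete is itself blocked, the waiting thread never wakes up, which is the kind of stall the thread dump in the quoted message shows.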