Hmm, looks like you are facing exactly the phenomenon I asked about. See my question here: http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/61326
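One quick way to see whether those cmdDistribExecutor threads keep accumulating on the receiving node is to count them in successive dumps, e.g. jstack <solr-pid> | grep -c cmdDistribExecutor, run a few times over an hour (<solr-pid> is a placeholder for your Solr JVM's process id). If the count only ever goes up, the pool isn't draining.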
On Sun, Mar 4, 2012 at 9:24 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> Hi,
>
> With auto-committing disabled we can now index many millions of documents in
> our test environment on a 5-node cluster with 5 shards and a replication
> factor of 2. The documents are uploaded from map/reduce. No significant
> changes were made to solrconfig and there are no update processors enabled.
> We are using a trunk revision from this weekend.
>
> The indexing speed is well below what we are used to seeing; we can easily
> index 5 million documents on a non-cloud Solr 3.x instance within an hour.
> What could be going on? There aren't many open TCP connections, the number
> of file descriptors is stable, and I/O is low, but CPU time is high! Each
> node has two Solr cores, both writing to their own dedicated disk.
>
> The indexing speed is stable: it was slow at the start and still is. It has
> now been running for well over 6 hours and only 3.5 million documents have
> been indexed. Another strange detail is that the node receiving all incoming
> documents (we're not yet using a client-side Solr server pool) has much
> larger disk usage than all the other nodes. This is peculiar, as we expected
> all replicas to be about the same size.
>
> The receiving node has slightly higher CPU than the other nodes, but the
> thread dump shows a very large number of threads of type
> cmdDistribExecutor-8-thread-292260 (295090) with 0-100ms CPU time. At the
> top of the list these threads all have < 20ms, but near the bottom it rises
> to just over 100ms. All nodes have a couple of http-80-30 (121994) threads,
> each with very high CPU time.
>
> Is this a known issue? Did I miss something? Any ideas?
>
> Thanks
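On the lopsided disk usage: since you mention not using a client-side server pool yet, spreading the adds over all nodes with SolrJ's LBHttpSolrServer should at least stop one node from absorbing every incoming request. A minimal sketch, assuming the five node URLs below as placeholders for your hosts:

    import org.apache.solr.client.solrj.impl.LBHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class PooledIndexer {
        public static void main(String[] args) throws Exception {
            // Round-robins update requests over all live nodes instead of
            // funnelling everything through a single receiver.
            LBHttpSolrServer pool = new LBHttpSolrServer(
                "http://node1:8983/solr", "http://node2:8983/solr",
                "http://node3:8983/solr", "http://node4:8983/solr",
                "http://node5:8983/solr");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            pool.add(doc);  // each add goes to the next live node in the pool
            pool.commit();
        }
    }

That won't explain the CPU burn by itself, but it should even out load and disk usage across the nodes. On trunk SolrCloud you could alternatively try pointing CloudSolrServer at your ZooKeeper ensemble and let it discover the nodes.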