Hmm, looks like you are facing exactly the phenomenon I asked about. See my question here: http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/61326
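One quick way to see whether those cmdDistribExecutor threads keep accumulating on the receiving node is to count them in successive dumps, e.g. jstack <solr-pid> | grep -c cmdDistribExecutor, run a few times over an hour (<solr-pid> is a placeholder for your Solr JVM's process id). If the count only ever goes up, the pool isn't draining.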
On Sun, Mar 4, 2012 at 9:24 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> Hi,
>
> With auto-committing disabled we can now index many millions of documents in
> our test environment on a 5-node cluster with 5 shards and a replication
> factor of 2. The documents are uploaded from map/reduce. No significant
> changes were made to solrconfig and there are no update processors enabled.
> We are using a trunk revision from this weekend.
>
> The indexing speed is well below what we are used to seeing; we can easily
> index 5 million documents on a non-cloud Solr 3.x instance within an hour.
> What could be going on? There aren't many open TCP connections, the number
> of file descriptors is stable, and I/O is low, but CPU time is high! Each
> node has two Solr cores, both writing to their own dedicated disk.
>
> The indexing speed is stable: it was slow at the start and still is. It has
> now been running for well over 6 hours and only 3.5 million documents have
> been indexed. Another strange detail is that the node receiving all incoming
> documents (we're not yet using a client-side Solr server pool) has much
> larger disk usage than all the other nodes. This is peculiar, as we expected
> all replicas to be about the same size.
>
> The receiving node has slightly higher CPU than the other nodes, but the
> thread dump shows a very large number of threads of type
> cmdDistribExecutor-8-thread-292260 (295090) with 0-100ms CPU time. At the
> top of the list these threads all have < 20ms, but near the bottom it rises
> to just over 100ms. All nodes have a couple of http-80-30 (121994) threads,
> each with very high CPU time.
>
> Is this a known issue? Did I miss something? Any ideas?
>
> Thanks
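On the lopsided disk usage: since you mention not using a client-side server pool yet, spreading the adds over all nodes with SolrJ's LBHttpSolrServer should at least stop one node from absorbing every incoming request. A minimal sketch, assuming the five node URLs below as placeholders for your hosts:

    import org.apache.solr.client.solrj.impl.LBHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class PooledIndexer {
        public static void main(String[] args) throws Exception {
            // Round-robins update requests over all live nodes instead of
            // funnelling everything through a single receiver.
            LBHttpSolrServer pool = new LBHttpSolrServer(
                "http://node1:8983/solr", "http://node2:8983/solr",
                "http://node3:8983/solr", "http://node4:8983/solr",
                "http://node5:8983/solr");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            pool.add(doc);  // each add goes to the next live node in the pool
            pool.commit();
        }
    }

That won't explain the CPU burn by itself, but it should even out load and disk usage across the nodes. On trunk SolrCloud you could alternatively try pointing CloudSolrServer at your ZooKeeper ensemble and let it discover the nodes.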