So this is only one slave that hangs up and not the master? Can you get thread dumps on both the master and the slave during a hang?
-Yonik
http://www.lucidimagination.com

On Mon, Mar 23, 2009 at 10:44 AM, Jeff Newburn <jnewb...@zappos.com> wrote:
> We are having an intermittent problem with replication. We reindex nightly
> which usually means there are 2 commits during replication then a final
> commit/optimize at the end. For some reason the replication will hang
> occasionally with the following screenshot. This is frustrating as it will
> completely stall out any further replications. Additionally, it seems to
> only happen on reindex and it will strike 1 server randomly but not always
> the same server.
>
> In case the screen shot doesn’t come through:
>
> Master http://10.66.209.38:8080/solr/zeta-main/replication
> Latest Index Version:1233423827699, Generation: 6237
> Replicatable Index Version:0, Generation: 0
> Poll Interval 00:05:00
> Local Index Index Version: 1233423827684, Generation: 6222
> Location: /opt/solr-data/zeta-main/index
> Size: 1.29 GB
> Times Replicated Since Startup: 3591
> Previous Replication Done At: Mon Mar 23 00:18:03 PDT 2009
> Config Files Replicated At: Wed Mar 18 06:07:53 PDT 2009
> Config Files Replicated: [synonyms.txt]
> Times Config Files Replicated Since Startup: 4
> Next Replication Cycle At: Mon Mar 23 00:27:55 PDT 2009
> Current Replication Status Start Time: Mon Mar 23 00:22:55 PDT 2009
> Files Downloaded: 12 / 163
> Downloaded: 4.12 MB / 1.41 GB [0.0%]
> Downloading File: _5no.tis, Downloaded: 0 bytes / 629.57 KB [0.0%]
> Time Elapsed: 26371s, Estimated Time Remaining: 9216278s, Speed: 163 bytes/s
>
> --
> Jeff Newburn
> Software Engineer, Zappos.com
> jnewb...@zappos.com - 702-943-7562
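
As a sanity check on the status figures quoted above, here is a rough Python sketch that recomputes the speed and the estimated time remaining from the downloaded size, the total size, and the elapsed time. It assumes MB and GB mean 1024-based units and that the estimate is simply the remaining bytes divided by the average rate so far. Both are guesses about how the status page derives its numbers, and the function names are made up for illustration.

```python
# Recompute the figures from the quoted replication status.
# Assumption: the page uses 1024-based units (1 MB = 1024**2 bytes).
MB = 1024 ** 2
GB = 1024 ** 3


def average_rate(downloaded_bytes, elapsed_seconds):
    """Average transfer rate in bytes per second since the start."""
    return downloaded_bytes / elapsed_seconds


def seconds_remaining(total_bytes, downloaded_bytes, rate):
    """Assumed estimate: remaining bytes divided by the average rate."""
    return (total_bytes - downloaded_bytes) / rate


downloaded = 4.12 * MB   # "Downloaded: 4.12 MB"
total = 1.41 * GB        # "/ 1.41 GB"
elapsed = 26371          # "Time Elapsed: 26371s"

rate = average_rate(downloaded, elapsed)
eta = seconds_remaining(total, downloaded, rate)

print(f"rate: {rate:.0f} bytes/s")
print(f"eta:  {eta:.0f} s (about {eta / 86400:.0f} days)")
```

Under those assumptions the result lands close to the reported 163 bytes/s and 9216278s. An estimate of more than 100 days, together with 0 bytes received for the current file, suggests the transfer has stopped rather than merely slowed down.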