I'm bringing this back up as requested. I'm currently in both the
"not replicating interesting things" state and the "has duplicate
replications" state. I think the forwarded message below shows the
"not replicating" problem. Active tasks shows the other (these are based
on replicator DB documents; an example follows the task list):
[
  {
    "checkpointed_source_seq": 2022317,
    "continuous": true,
    "doc_id": "cbstats-from-dogbowl",
    "doc_write_failures": 0,
    "docs_read": 300,
    "docs_written": 300,
    "missing_revisions_found": 300,
    "pid": "<0.10466.12>",
    "progress": 100,
    "replication_id": "50daecd0a29f4b7e5d102990831f3d64+continuous",
    "revisions_checked": 304,
    "source": "http://dustin:*****@single.couchbase.net/cbstats/",
    "source_seq": 2022317,
    "started_on": 1349309457,
    "target": "cbstats",
    "type": "replication",
    "updated_on": 1349310442
  },
  {
    "checkpointed_source_seq": 2022317,
    "continuous": true,
    "doc_id": "cbstats-from-dogbowl",
    "doc_write_failures": 0,
    "docs_read": 62,
    "docs_written": 62,
    "missing_revisions_found": 62,
    "pid": "<0.11019.12>",
    "progress": 100,
    "replication_id": "411e341d5aa9a3fe636cf4ea8ba71720+continuous",
    "revisions_checked": 304,
    "source": "http://dustin:*****@single.couchbase.net/cbstats/",
    "source_seq": 2022317,
    "started_on": 1349309471,
    "target": "cbstats",
    "type": "replication",
    "updated_on": 1349310443
  },
  {
    "checkpointed_source_seq": 107068,
    "continuous": true,
    "doc_id": "gerrit-from-prod",
    "doc_write_failures": 0,
    "docs_read": 22,
    "docs_written": 22,
    "missing_revisions_found": 22,
    "pid": "<0.11086.12>",
    "progress": 100,
    "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous",
    "revisions_checked": 26,
    "source": "http://dustinphoto.iriscouch.com/gerrit/",
    "source_seq": 107068,
    "started_on": 1349309487,
    "target": "gerrit",
    "type": "replication",
    "updated_on": 1349310445
  },
  {
    "checkpointed_source_seq": 107068,
    "continuous": true,
    "doc_id": "gerrit-from-prod",
    "doc_write_failures": 0,
    "docs_read": 17,
    "docs_written": 17,
    "missing_revisions_found": 17,
    "pid": "<0.11107.12>",
    "progress": 100,
    "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous",
    "revisions_checked": 26,
    "source": "http://dustinphoto.iriscouch.com/gerrit/",
    "source_seq": 107068,
    "started_on": 1349309488,
    "target": "gerrit",
    "type": "replication",
    "updated_on": 1349310445
  }
]
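
The duplicates in that task list can be picked out mechanically. What
follows is only a rough sketch, not anything CouchDB ships: the function
name find_duplicate_replications is made up for illustration, and the
sample data is the task list above trimmed to the three fields the check
needs. It groups replication tasks by doc_id and reports every replicator
document that has more than one running task.

```python
from collections import defaultdict


def find_duplicate_replications(tasks):
    """Return {doc_id: [replication_id, ...]} for docs with more than one task.

    Hypothetical helper: `tasks` is a list of dicts shaped like the
    active-tasks entries shown above.
    """
    by_doc = defaultdict(list)
    for task in tasks:
        # Only replication tasks that were started from a replicator doc.
        if task.get("type") == "replication" and "doc_id" in task:
            by_doc[task["doc_id"]].append(task["replication_id"])
    return {doc_id: ids for doc_id, ids in by_doc.items() if len(ids) > 1}


# The four tasks from the listing above, trimmed to the relevant fields.
tasks = [
    {"type": "replication", "doc_id": "cbstats-from-dogbowl",
     "replication_id": "50daecd0a29f4b7e5d102990831f3d64+continuous"},
    {"type": "replication", "doc_id": "cbstats-from-dogbowl",
     "replication_id": "411e341d5aa9a3fe636cf4ea8ba71720+continuous"},
    {"type": "replication", "doc_id": "gerrit-from-prod",
     "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous"},
    {"type": "replication", "doc_id": "gerrit-from-prod",
     "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous"},
]

duplicates = find_duplicate_replications(tasks)
for doc_id, ids in sorted(duplicates.items()):
    print(doc_id, len(ids), "tasks:", ", ".join(ids))
```

With the listing above, both cbstats-from-dogbowl and gerrit-from-prod
come back with two replication IDs each.
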
The replicator document for the latter (gerrit-from-prod), for example, is this:
{
  "_id": "gerrit-from-prod",
  "_rev": "2235-36de10fb757581a1782dacbb26ee4809",
  "source": "http://dustinphoto.iriscouch.com/gerrit",
  "target": "gerrit",
  "continuous": true,
  "user_ctx": {
    "roles": [
      "_admin"
    ]
  },
  "_replication_state_time": "2012-10-03T17:11:27-07:00",
  "_replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9",
  "_replication_state": "triggered"
}
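
To see which of the two gerrit-from-prod tasks the replicator document
actually records, the document's _replication_id can be compared with each
task's replication_id once the "+continuous" suffix is dropped. This is
a sketch under assumptions: split_tasks_by_doc_match is a made-up helper
name, the data is copied from the document and task list above, and
whether the non-matching task is truly stale is a guess, not something the
output proves.

```python
def split_tasks_by_doc_match(rep_doc, tasks):
    """Split the tasks for rep_doc into (matching_pids, other_pids).

    Hypothetical helper: a task "matches" when its replication_id,
    minus any "+..." suffix, equals the doc's _replication_id.
    """
    recorded = rep_doc.get("_replication_id")
    matching, other = [], []
    for task in tasks:
        if task.get("doc_id") != rep_doc["_id"]:
            continue  # task belongs to a different replicator doc
        base_id = task["replication_id"].split("+", 1)[0]
        if base_id == recorded:
            matching.append(task["pid"])
        else:
            other.append(task["pid"])
    return matching, other


# Fields copied from the replicator document and task list above.
rep_doc = {
    "_id": "gerrit-from-prod",
    "_replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9",
}
tasks = [
    {"doc_id": "gerrit-from-prod", "pid": "<0.11086.12>",
     "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous"},
    {"doc_id": "gerrit-from-prod", "pid": "<0.11107.12>",
     "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous"},
]

matching, other = split_tasks_by_doc_match(rep_doc, tasks)
print("recorded in doc:", matching)
print("not recorded in doc:", other)
```

Here the task with pid <0.11107.12> matches the document, and the one
with pid <0.11086.12> does not.
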
Begin forwarded message:
> From: Dustin Sallings <[email protected]>
> Subject: Re: replication problems
> Date: June 15, 2012 0:10:04 PDT
> To: [email protected]
> Reply-To: [email protected]
>
>
> On Jun 14, 2012, at 11:28 PM, Benoit Chesneau wrote:
>
>> Are you using _replicate or _replicator? Anything interesting in the logs?
>
>
> I'm using _replicator (wonderful feature, I just kill the DB and
> everything goes back the way I want it).
>
> Hmm... I do think I found some stuff digging through the logs. This
> is the local DB I noticed not doing its thing, although there were tons of
> errors all around this. Looks like the server got into some kind of bad
> state and sort of half-crashed.
>
>
> [Thu, 14 Jun 2012 23:20:12 GMT] [error] [<0.133.0>] Replication
> `ae601df0373da82d1b4a9ff741c8ba18+continuous` (`rpics` -> `rpics-processed`)
> failed: {{timeout,{gen_server,call,[<0.213.0>,{open_ref_count,<0.442.0>}]}},
> {gen_server,call,
> [couch_server,
> {open,<<"rpics">>,
> [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}]},
> infinity]}}
> [Thu, 14 Jun 2012 23:20:25 GMT] [error] [<0.383.0>] ** Generic server
> <0.383.0> terminating
> ** Last message in was {'EXIT',<0.384.0>,
> {{timeout,
> {gen_server,call,
> [<0.213.0>,{open_ref_count,<0.442.0>}]}},
> {gen_server,call,
> [couch_server,
> {open,<<"cbstats">>,
> [{user_ctx,
> {user_ctx,null,[<<"_admin">>],undefined}},
> {user_ctx,
> {user_ctx,null,[<<"_admin">>],undefined}}]},
> infinity]}}}
>
> ** When Server state == {state,<0.272.0>,<0.384.0>,20,
> {httpdb,
>
> "http://dustin:[email protected]/cbstats/",
> nil,
> [{"Accept","application/json"},
> {"User-Agent","CouchDB/1.2.0"}],
> 30000,
> [{socket_options,
> [{keepalive,true},{nodelay,false}]}],
> 10,250,<0.273.0>,20},
> {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,
> <0.290.0>,<0.286.0>,<0.367.0>,
> {db_header,6,984356,0,
> {860345646,{737369,975,640891414},59433736},
> {860348005,738344,42056446},
> {860352635,[],5737},
> 0,nil,nil,1000},
> 984356,
> {btree,<0.286.0>,
> {860345646,{737369,975,640891414},59433736},
> #Fun<couch_db_updater.10.57960608>,
> #Fun<couch_db_updater.11.57960608>,
> #Fun<couch_btree.5.133731799>,
> #Fun<couch_db_updater.12.57960608>,snappy},
> {btree,<0.286.0>,
> {860348005,738344,42056446},
> #Fun<couch_db_updater.13.57960608>,
> #Fun<couch_db_updater.14.57960608>,
> #Fun<couch_btree.5.133731799>,
> #Fun<couch_db_updater.15.57960608>,snappy},
> {btree,<0.286.0>,
> {860352635,[],5737},
> #Fun<couch_btree.3.133731799>,
> #Fun<couch_btree.4.133731799>,
> #Fun<couch_btree.5.133731799>,nil,snappy},
> 984356,<<"cbstats">>,
> "/Volumes/terror/db/couchdb/cbstats.couch",[],[],
> nil,
> {user_ctx,null,[<<"_admin">>],undefined},
> nil,1000,
> [before_header,after_header,on_file_open],
> [{user_ctx,
> {user_ctx,null,[<<"_admin">>],undefined}}],
> snappy,nil,nil},
> [],nil,nil,nil,
> {rep_stats,0,0,0,0,0},
> nil,<0.385.0>,
> {batch,[],0}}
> ** Reason for termination ==
> ** {noproc,{gen_server,call,[<0.367.0>,{drop,<0.383.0>},infinity]}}
>
>
>
>
> Scrolling to the beginning of the errors, I find this:
>
>
> [Thu, 14 Jun 2012 23:15:54 GMT] [error] [<0.164.0>] Replication
> `543f76281e8d52d6ce5b51fddf0588e7+continuous` (`photo` ->
> `http://dustin:*****@dustinphoto.couchone.com/photo/`) failed: source_db_down
> [Thu, 14 Jun 2012 23:18:57 GMT] [info] [<0.358.0>] 127.0.0.1 - - GET
> /_all_dbs 200
> [Thu, 14 Jun 2012 23:19:52 GMT] [error] [<0.289.0>] ** Generic server
> <0.289.0> terminating
> ** Last message in was {update_docs,<0.272.0>,[],
> [{{doc,
> <<"_local/c4cc070f896d7267e52ba012856fed4b">>,
> {0,[<<"346185">>]},
> {[{<<"session_id">>,
> <<"9fb3475683d44bb1e151031dd42cc59f">>},
> {<<"source_last_seq">>,1419004},
> {<<"replication_id_version">>,2},
> {<<"history">>,
> [{[{<<"session_id">>,
>
> <<"9fb3475683d44bb1e151031dd42cc59f">>},
> {<<"start_time">>,
> <<"Thu, 14 Jun 2012 01:35:02 GMT">>},
> {<<"end_time">>,
> <<"Thu, 14 Jun 2012 23:15:29 GMT">>},
> {<<"start_last_seq">>,1410146},
> {<<"end_last_seq">>,1419004},
> {<<"recorded_seq">>,1419004},
> {<<"missing_checked">>,8100},
> {<<"missing_found">>,8100},
> {<<"docs_read">>,8100},
> {<<"docs_written">>,8100},
> {<<"doc_write_failures">>,0}]},
> {[{<<"session_id">>,
>
> <<"3edd7c50327eab7ec0768451e34efa8b">>},
> {<<"start_time">>,
> <<"Tue, 12 Jun 2012 05:51:17 GMT">>},
> {<<"end_time">>,
> <<"Tue, 12 Jun 2012 13:02:37 GMT">>},
> {<<"start_last_seq">>,1407186},
> {<<"end_last_seq">>,1410146},
> {<<"recorded_seq">>,1410146},
> {<<"missing_checked">>,2583},
> {<<"missing_found">>,2577},
> {<<"docs_read">>,2577},
> {<<"docs_written">>,2577},
> {<<"doc_write_failures">>,0}]},
> {[{<<"session_id">>,
>
> <<"172de62044281a01b1584a9d099f42af">>},
> {<<"start_time">>,
> <<"Mon, 11 Jun 2012 03:40:11 GMT">>},
> {<<"end_time">>,
> <<"Mon, 11 Jun 2012 15:16:24 GMT">>},
> {<<"start_last_seq">>,1405428},
> {<<"end_last_seq">>,1407186},
> {<<"recorded_seq">>,1407186},
> {<<"missing_checked">>,1721},
> {<<"missing_found">>,1721},
> {<<"docs_read">>,1721},
> {<<"docs_written">>,1721},
> {<<"doc_write_failures">>,0}]},
> {[{<<"session_id">>,
>
> <<"e60a126a2036c5fab00a1249101820c8">>},
> {<<"start_time">>,
> <<"Sat, 09 Jun 2012 07:47:22 GMT">>},
> {<<"end_time">>,
> <<"Sun, 10 Jun 2012 21:16:20 GMT">>},
> {<<"start_last_seq">>,1386289},
> {<<"end_last_seq">>,1405428},
> {<<"recorded_seq">>,1405428},
> {<<"missing_checked">>,16977},
> {<<"missing_found">>,16977},
> {<<"docs_read">>,16977},
> {<<"docs_written">>,16977},
> {<<"doc_write_failures">>,0}]},
> {[{<<"session_id">>,
>
> <<"ef3e4333d340dcf73ddfa3fe8c720042">>},
> {<<"start_time">>,
> <<"Mon, 04 Jun 2012 02:39:44 GMT">>},
> {<<"end_time">>,
> <<"Mon, 04 Jun 2012 12:35:50 GMT">>},
> {<<"start_last_seq">>,1384738},
> {<<"end_last_seq">>,1386289},
> {<<"recorded_seq">>,1386289},
> {<<"missing_checked">>,1551},
> {<<"missing_found">>,1550},
> {<<"docs_read">>,1550},
> {<<"docs_written">>,1550},
> {<<"doc_write_failures">>,0}]},
> {[{<<"session_id">>,
>
> <<"d5123a3caf462794aaf5a47be1bb3b6e">>},
> {<<"start_time">>,
> <<"Wed, 30 May 2012 20:41:43 GMT">>},
> {<<"end_time">>,
> <<"Mon, 04 Jun 2012 02:37:33 GMT">>},
> {<<"start_last_seq">>,1372404},
> {<<"end_last_seq">>,1384738},
> {<<"recorded_seq">>,1384738},
> {<<"missing_checked">>,12334},
> {<<"missing_found">>,12333},
> {<<"docs_read">>,12333},
> {<<"docs_written">>,12333},
> {<<"doc_write_failures">>,0}]},
> {[{<<"session_id">>,
>
> <<"52a16e8832f70dc094f6fff5e9b7d75b">>},
> {<<"start_time">>,
> <<"Sun, 27 May 2012 23:36:41 GMT">>},
> {<<"end_time">>,
> <<"Wed, 30 May 2012 20:40:14 GMT">>},
> {<<"start_last_seq">>,1361049},
> {<<"end_last_seq">>,1372404},
> {<<"recorded_seq">>,1372404},
> {<<"missing_checked">>,11355},
> {<<"missing_found">>,11355},
> {<<"docs_read">>,11355},
> {<<"docs_written">>,11355},
> {<<"doc_write_failures">>,0}]},
> [...lots of these...]
>
> [],false,[]},
> #Ref<0.0.15.159973>}],
> false,false}
> ** When Server state == {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,
> <0.290.0>,<0.286.0>,<0.367.0>,
> {db_header,6,992456,0,
> {943280145,{744250,975,647546641},60017672},
> {943282327,745225,42485979},
> {943267963,[],5753},
> 0,nil,nil,1000},
> 992456,
> {btree,<0.286.0>,
> {943280145,{744250,975,647546641},60017672},
> #Fun<couch_db_updater.10.57960608>,
> #Fun<couch_db_updater.11.57960608>,
> #Fun<couch_btree.5.133731799>,
> #Fun<couch_db_updater.12.57960608>,snappy},
> {btree,<0.286.0>,
> {943282327,745225,42485979},
> #Fun<couch_db_updater.13.57960608>,
> #Fun<couch_db_updater.14.57960608>,
> #Fun<couch_btree.5.133731799>,
> #Fun<couch_db_updater.15.57960608>,snappy},
> {btree,<0.286.0>,
> {943267963,[],5753},
> #Fun<couch_btree.3.133731799>,
> #Fun<couch_btree.4.133731799>,
> #Fun<couch_btree.5.133731799>,nil,snappy},
> 992456,<<"cbstats">>,
> "/Volumes/terror/db/couchdb/cbstats.couch",[],[],
> nil,
> {user_ctx,null,[],undefined},
> nil,1000,
> [before_header,after_header,on_file_open],
> [{user_ctx,
> {user_ctx,null,[<<"_admin">>],undefined}}],
> snappy,nil,nil}
> ** Reason for termination ==
> ** {timeout,
> {gen_server,call,
> [<0.288.0>,
> {db_updated,
> {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,<0.290.0>,
> <0.286.0>,<0.367.0>,
> {db_header,6,992456,0,
> {943280145,{744250,975,647546641},60017672},
> {943282327,745225,42485979},
> {943267963,[],5753},
> 0,nil,nil,1000},
> 992456,
> {btree,<0.286.0>,
> {943280145,{744250,975,647546641},60017672},
> #Fun<couch_db_updater.10.57960608>,
> #Fun<couch_db_updater.11.57960608>,
> #Fun<couch_btree.5.133731799>,
> #Fun<couch_db_updater.12.57960608>,snappy},
> {btree,<0.286.0>,
> {943282327,745225,42485979},
> #Fun<couch_db_updater.13.57960608>,
> #Fun<couch_db_updater.14.57960608>,
> #Fun<couch_btree.5.133731799>,
> #Fun<couch_db_updater.15.57960608>,snappy},
> {btree,<0.286.0>,
> {943284347,[],5756},
> #Fun<couch_btree.3.133731799>,
> #Fun<couch_btree.4.133731799>,
> #Fun<couch_btree.5.133731799>,nil,snappy},
> 992456,<<"cbstats">>,
> "/Volumes/terror/db/couchdb/cbstats.couch",[],[],nil,
> {user_ctx,null,[],undefined},
> #Ref<0.0.15.160107>,1000,
> [before_header,after_header,on_file_open],
> [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}],
> snappy,nil,nil}}]}}
>
>
>
>
> --
> dustin sallings
>
>
>
--
dustin sallings