I'll note here that the attached patch is wrong. It uses a single uuid from the node running the replication, which might be neither the source nor the target. The uuids of the source and the target must each be retrieved and used in place of their host:port. Jason's suggestion to add the uuid (stored in the ini file) to the welcome message sounds really good to me.
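To make the idea concrete, here is a minimal sketch in Python. It is not the replicator's actual Erlang code or its real hashing scheme; the function name `replication_id`, the endpoint dicts, and their `uuid`/`db`/`url` keys are all hypothetical, and it assumes each server's uuid has already been fetched somehow. The only point is that the id is derived from the source and target uuids plus the database names, and the host:port used to reach them plays no part.

```python
import hashlib
import json


def replication_id(source, target, continuous=False):
    """Derive a replication id from endpoint uuids (hypothetical scheme).

    `source` and `target` are dicts with "uuid" and "db" keys. Any "url"
    key is deliberately ignored, so renaming a host or changing a port
    does not change the id.
    """
    basis = json.dumps(
        [source["uuid"], source["db"], target["uuid"], target["db"]]
    )
    digest = hashlib.md5(basis.encode("utf-8")).hexdigest()
    return digest + ("+continuous" if continuous else "")


# The same pair of servers reached through two different host names.
src_a = {"url": "http://example.org:5984/gerrit", "uuid": "src-uuid", "db": "gerrit"}
src_b = {"url": "http://10.0.0.5:15984/gerrit", "uuid": "src-uuid", "db": "gerrit"}
tgt = {"url": "http://localhost:5984/gerrit", "uuid": "tgt-uuid", "db": "gerrit"}

same = replication_id(src_a, tgt, continuous=True) == replication_id(
    src_b, tgt, continuous=True
)
print(same)
```

With a scheme along these lines, a replication restarted against the same two servers under a different address would compute the same id, which is the property the host:port-based id lacks.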
Can't attach this to the ticket today as I don't have my Jira creds.

Sent from the ocean floor

On 10 Oct 2012, at 21:40, Jan Lehnardt <[email protected]> wrote:

> flagged.
>
> On Oct 10, 2012, at 22:34 , Robert Newson <[email protected]> wrote:
>
>> Jan,
>>
>> Flag that as fix-for 1.3? I don't have my creds on my phone to do it.
>>
>> I like the ini uuid idea best, modelled after the cookie with secret.
>> If we have the uuid, we'd omit host name as well as port, right?
>>
>> Sent from the ocean floor
>>
>> On 10 Oct 2012, at 21:12, Jan Lehnardt <[email protected]> wrote:
>>
>>> Filipe tells me this is https://issues.apache.org/jira/browse/COUCHDB-1259
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>> On Oct 4, 2012, at 02:28 , Dustin Sallings <[email protected]> wrote:
>>>
>>>>
>>>> I'm bringing this back up as requested. I'm currently simultaneously in
>>>> the "not replicating interesting things" and "has duplicate replicates
>>>> state". I think the stuff below shows the "not replicating" stuff.
>>>> >>>> Active tasks shows the other (these are based on replicator DB documents >>>> (example below): >>>> >>>> [ >>>> { >>>> "checkpointed_source_seq": 2022317, >>>> "continuous": true, >>>> "doc_id": "cbstats-from-dogbowl", >>>> "doc_write_failures": 0, >>>> "docs_read": 300, >>>> "docs_written": 300, >>>> "missing_revisions_found": 300, >>>> "pid": "<0.10466.12>", >>>> "progress": 100, >>>> "replication_id": "50daecd0a29f4b7e5d102990831f3d64+continuous", >>>> "revisions_checked": 304, >>>> "source": "http://dustin:*****@single.couchbase.net/cbstats/", >>>> "source_seq": 2022317, >>>> "started_on": 1349309457, >>>> "target": "cbstats", >>>> "type": "replication", >>>> "updated_on": 1349310442 >>>> }, >>>> { >>>> "checkpointed_source_seq": 2022317, >>>> "continuous": true, >>>> "doc_id": "cbstats-from-dogbowl", >>>> "doc_write_failures": 0, >>>> "docs_read": 62, >>>> "docs_written": 62, >>>> "missing_revisions_found": 62, >>>> "pid": "<0.11019.12>", >>>> "progress": 100, >>>> "replication_id": "411e341d5aa9a3fe636cf4ea8ba71720+continuous", >>>> "revisions_checked": 304, >>>> "source": "http://dustin:*****@single.couchbase.net/cbstats/", >>>> "source_seq": 2022317, >>>> "started_on": 1349309471, >>>> "target": "cbstats", >>>> "type": "replication", >>>> "updated_on": 1349310443 >>>> }, >>>> { >>>> "checkpointed_source_seq": 107068, >>>> "continuous": true, >>>> "doc_id": "gerrit-from-prod", >>>> "doc_write_failures": 0, >>>> "docs_read": 22, >>>> "docs_written": 22, >>>> "missing_revisions_found": 22, >>>> "pid": "<0.11086.12>", >>>> "progress": 100, >>>> "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous", >>>> "revisions_checked": 26, >>>> "source": "http://dustinphoto.iriscouch.com/gerrit/", >>>> "source_seq": 107068, >>>> "started_on": 1349309487, >>>> "target": "gerrit", >>>> "type": "replication", >>>> "updated_on": 1349310445 >>>> }, >>>> { >>>> "checkpointed_source_seq": 107068, >>>> "continuous": true, >>>> "doc_id": "gerrit-from-prod", >>>> 
"doc_write_failures": 0, >>>> "docs_read": 17, >>>> "docs_written": 17, >>>> "missing_revisions_found": 17, >>>> "pid": "<0.11107.12>", >>>> "progress": 100, >>>> "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous", >>>> "revisions_checked": 26, >>>> "source": "http://dustinphoto.iriscouch.com/gerrit/", >>>> "source_seq": 107068, >>>> "started_on": 1349309488, >>>> "target": "gerrit", >>>> "type": "replication", >>>> "updated_on": 1349310445 >>>> } >>>> ] >>>> >>>> >>>> The replicator document for the latter, for example is this: >>>> >>>> { >>>> "_id": "gerrit-from-prod", >>>> "_rev": "2235-36de10fb757581a1782dacbb26ee4809", >>>> "source": "http://dustinphoto.iriscouch.com/gerrit", >>>> "target": "gerrit", >>>> "continuous": true, >>>> "user_ctx": { >>>> "roles": [ >>>> "_admin" >>>> ] >>>> }, >>>> "_replication_state_time": "2012-10-03T17:11:27-07:00", >>>> "_replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9", >>>> "_replication_state": "triggered" >>>> } >>>> >>>> >>>> Begin forwarded message: >>>> >>>>> From: Dustin Sallings <[email protected]> >>>>> Subject: Re: replication problems >>>>> Date: June 15, 2012 0:10:04 PDT >>>>> To: [email protected] >>>>> Reply-To: [email protected] >>>>> >>>>> >>>>> On Jun 14, 2012, at 11:28 PM, Benoit Chesneau wrote: >>>>> >>>>>> Ar you using _replicate or _replicator ? Anything interresting in logs? >>>>> >>>>> >>>>> I'm using _replicator (wonderful feature, I just kill the DB and >>>>> everything goes back the way I want it). >>>>> >>>>> Hmm... I do think I found some stuff digging through the logs. This is >>>>> the local DB I noticed not doing its thing, although there were tons of >>>>> errors all around this. Looks like the server got into some kind of bad >>>>> state and sort of half-crashed. 
>>>>> >>>>> >>>>> [Thu, 14 Jun 2012 23:20:12 GMT] [error] [<0.133.0>] Replication >>>>> `ae601df0373da82d1b4a9ff741c8ba18+continuous` (`rpics` -> >>>>> `rpics-processed`) failed: >>>>> {{timeout,{gen_server,call,[<0.213.0>,{open_ref_count,<0.4 >>>>> 42.0>}]}}, >>>>> {gen_server,call, >>>>> [couch_server, >>>>> {open,<<"rpics">>, >>>>> [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}]}, >>>>> infinity]}} >>>>> [Thu, 14 Jun 2012 23:20:25 GMT] [error] [<0.383.0>] ** Generic server >>>>> <0.383.0> terminating >>>>> ** Last message in was {'EXIT',<0.384.0>, >>>>> {{timeout, >>>>> {gen_server,call, >>>>> [<0.213.0>,{open_ref_count,<0.442.0>}]}}, >>>>> {gen_server,call, >>>>> [couch_server, >>>>> {open,<<"cbstats">>, >>>>> [{user_ctx, >>>>> {user_ctx,null,[<<"_admin">>],undefined}}, >>>>> {user_ctx, >>>>> {user_ctx,null,[<<"_admin">>],undefined}}]}, >>>>> infinity]}}} >>>>> >>>>> ** When Server state == {state,<0.272.0>,<0.384.0>,20, >>>>> {httpdb, >>>>> >>>>> "http://dustin:[email protected]/cbstats/", >>>>> nil, >>>>> [{"Accept","application/json"}, >>>>> {"User-Agent","CouchDB/1.2.0"}], >>>>> 30000, >>>>> [{socket_options, >>>>> [{keepalive,true},{nodelay,false}]}], >>>>> 10,250,<0.273.0>,20}, >>>>> {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>, >>>>> <0.290.0>,<0.286.0>,<0.367.0>, >>>>> {db_header,6,984356,0, >>>>> {860345646,{737369,975,640891414},59433736}, >>>>> {860348005,738344,42056446}, >>>>> {860352635,[],5737}, >>>>> 0,nil,nil,1000}, >>>>> 984356, >>>>> {btree,<0.286.0>, >>>>> {860345646,{737369,975,640891414},59433736}, >>>>> #Fun<couch_db_updater.10.57960608>, >>>>> #Fun<couch_db_updater.11.57960608>, >>>>> #Fun<couch_btree.5.133731799>, >>>>> #Fun<couch_db_updater.12.57960608>,snappy}, >>>>> {btree,<0.286.0>, >>>>> {860348005,738344,42056446}, >>>>> #Fun<couch_db_updater.13.57960608>, >>>>> #Fun<couch_db_updater.14.57960608>, >>>>> #Fun<couch_btree.5.133731799>, >>>>> #Fun<couch_db_updater.15.57960608>,snappy}, >>>>> {btree,<0.286.0>, >>>>> 
{860352635,[],5737}, >>>>> #Fun<couch_btree.3.133731799>, >>>>> #Fun<couch_btree.4.133731799>, >>>>> #Fun<couch_btree.5.133731799>,nil,snappy}, >>>>> 984356,<<"cbstats">>, >>>>> "/Volumes/terror/db/couchdb/cbstats.couch",[],[], >>>>> nil, >>>>> {user_ctx,null,[<<"_admin">>],undefined}, >>>>> nil,1000, >>>>> [before_header,after_header,on_file_open], >>>>> [{user_ctx, >>>>> {user_ctx,null,[<<"_admin">>],undefined}}], >>>>> snappy,nil,nil}, >>>>> [],nil,nil,nil, >>>>> {rep_stats,0,0,0,0,0}, >>>>> nil,<0.385.0>, >>>>> {batch,[],0}} >>>>> ** Reason for termination == >>>>> ** {noproc,{gen_server,call,[<0.367.0>,{drop,<0.383.0>},infinity]}} >>>>> >>>>> >>>>> >>>>> >>>>> Scrolling to the beginning of the errors, I find this: >>>>> >>>>> >>>>> [Thu, 14 Jun 2012 23:15:54 GMT] [error] [<0.164.0>] Replication >>>>> `543f76281e8d52d6ce5b51fddf0588e7+continuous` (`photo` -> >>>>> `http://dustin:*****@dustinphoto.couchone.com/photo/`) failed: >>>>> source_db_down >>>>> [Thu, 14 Jun 2012 23:18:57 GMT] [info] [<0.358.0>] 127.0.0.1 - - GET >>>>> /_all_dbs 200 >>>>> [Thu, 14 Jun 2012 23:19:52 GMT] [error] [<0.289.0>] ** Generic server >>>>> <0.289.0> terminating >>>>> ** Last message in was {update_docs,<0.272.0>,[], >>>>> [{{doc, >>>>> <<"_local/c4cc070f896d7267e52ba012856fed4b">>, >>>>> {0,[<<"346185">>]}, >>>>> {[{<<"session_id">>, >>>>> <<"9fb3475683d44bb1e151031dd42cc59f">>}, >>>>> {<<"source_last_seq">>,1419004}, >>>>> {<<"replication_id_version">>,2}, >>>>> {<<"history">>, >>>>> [{[{<<"session_id">>, >>>>> >>>>> <<"9fb3475683d44bb1e151031dd42cc59f">>}, >>>>> {<<"start_time">>, >>>>> <<"Thu, 14 Jun 2012 01:35:02 GMT">>}, >>>>> {<<"end_time">>, >>>>> <<"Thu, 14 Jun 2012 23:15:29 GMT">>}, >>>>> {<<"start_last_seq">>,1410146}, >>>>> {<<"end_last_seq">>,1419004}, >>>>> {<<"recorded_seq">>,1419004}, >>>>> {<<"missing_checked">>,8100}, >>>>> {<<"missing_found">>,8100}, >>>>> {<<"docs_read">>,8100}, >>>>> {<<"docs_written">>,8100}, >>>>> {<<"doc_write_failures">>,0}]}, >>>>> 
{[{<<"session_id">>, >>>>> >>>>> <<"3edd7c50327eab7ec0768451e34efa8b">>}, >>>>> {<<"start_time">>, >>>>> <<"Tue, 12 Jun 2012 05:51:17 GMT">>}, >>>>> {<<"end_time">>, >>>>> <<"Tue, 12 Jun 2012 13:02:37 GMT">>}, >>>>> {<<"start_last_seq">>,1407186}, >>>>> {<<"end_last_seq">>,1410146}, >>>>> {<<"recorded_seq">>,1410146}, >>>>> {<<"missing_checked">>,2583}, >>>>> {<<"missing_found">>,2577}, >>>>> {<<"docs_read">>,2577}, >>>>> {<<"docs_written">>,2577}, >>>>> {<<"doc_write_failures">>,0}]}, >>>>> {[{<<"session_id">>, >>>>> >>>>> <<"172de62044281a01b1584a9d099f42af">>}, >>>>> {<<"start_time">>, >>>>> <<"Mon, 11 Jun 2012 03:40:11 GMT">>}, >>>>> {<<"end_time">>, >>>>> <<"Mon, 11 Jun 2012 15:16:24 GMT">>}, >>>>> {<<"start_last_seq">>,1405428}, >>>>> {<<"end_last_seq">>,1407186}, >>>>> {<<"recorded_seq">>,1407186}, >>>>> {<<"missing_checked">>,1721}, >>>>> {<<"missing_found">>,1721}, >>>>> {<<"docs_read">>,1721}, >>>>> {<<"docs_written">>,1721}, >>>>> {<<"doc_write_failures">>,0}]}, >>>>> {[{<<"session_id">>, >>>>> >>>>> <<"e60a126a2036c5fab00a1249101820c8">>}, >>>>> {<<"start_time">>, >>>>> <<"Sat, 09 Jun 2012 07:47:22 GMT">>}, >>>>> {<<"end_time">>, >>>>> <<"Sun, 10 Jun 2012 21:16:20 GMT">>}, >>>>> {<<"start_last_seq">>,1386289}, >>>>> {<<"end_last_seq">>,1405428}, >>>>> {<<"recorded_seq">>,1405428}, >>>>> {<<"missing_checked">>,16977}, >>>>> {<<"missing_found">>,16977}, >>>>> {<<"docs_read">>,16977}, >>>>> {<<"docs_written">>,16977}, >>>>> {<<"doc_write_failures">>,0}]}, >>>>> {[{<<"session_id">>, >>>>> >>>>> <<"ef3e4333d340dcf73ddfa3fe8c720042">>}, >>>>> {<<"start_time">>, >>>>> <<"Mon, 04 Jun 2012 02:39:44 GMT">>}, >>>>> {<<"end_time">>, >>>>> <<"Mon, 04 Jun 2012 12:35:50 GMT">>}, >>>>> {<<"start_last_seq">>,1384738}, >>>>> {<<"end_last_seq">>,1386289}, >>>>> {<<"recorded_seq">>,1386289}, >>>>> {<<"missing_checked">>,1551}, >>>>> {<<"missing_found">>,1550}, >>>>> {<<"docs_read">>,1550}, >>>>> {<<"docs_written">>,1550}, >>>>> {<<"doc_write_failures">>,0}]}, >>>>> 
{[{<<"session_id">>, >>>>> >>>>> <<"d5123a3caf462794aaf5a47be1bb3b6e">>}, >>>>> {<<"start_time">>, >>>>> <<"Wed, 30 May 2012 20:41:43 GMT">>}, >>>>> {<<"end_time">>, >>>>> <<"Mon, 04 Jun 2012 02:37:33 GMT">>}, >>>>> {<<"start_last_seq">>,1372404}, >>>>> {<<"end_last_seq">>,1384738}, >>>>> {<<"recorded_seq">>,1384738}, >>>>> {<<"missing_checked">>,12334}, >>>>> {<<"missing_found">>,12333}, >>>>> {<<"docs_read">>,12333}, >>>>> {<<"docs_written">>,12333}, >>>>> {<<"doc_write_failures">>,0}]}, >>>>> {[{<<"session_id">>, >>>>> >>>>> <<"52a16e8832f70dc094f6fff5e9b7d75b">>}, >>>>> {<<"start_time">>, >>>>> <<"Sun, 27 May 2012 23:36:41 GMT">>}, >>>>> {<<"end_time">>, >>>>> <<"Wed, 30 May 2012 20:40:14 GMT">>}, >>>>> {<<"start_last_seq">>,1361049}, >>>>> {<<"end_last_seq">>,1372404}, >>>>> {<<"recorded_seq">>,1372404}, >>>>> {<<"missing_checked">>,11355}, >>>>> {<<"missing_found">>,11355}, >>>>> {<<"docs_read">>,11355}, >>>>> {<<"docs_written">>,11355}, >>>>> {<<"doc_write_failures">>,0}]}, >>>>> [...lots of these...] 
>>>>> >>>>> [],false,[]}, >>>>> #Ref<0.0.15.159973>}], >>>>> false,false} >>>>> ** When Server state == >>>>> {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>, >>>>> <0.290.0>,<0.286.0>,<0.367.0>, >>>>> {db_header,6,992456,0, >>>>> {943280145,{744250,975,647546641},60017672}, >>>>> {943282327,745225,42485979}, >>>>> {943267963,[],5753}, >>>>> 0,nil,nil,1000}, >>>>> 992456, >>>>> {btree,<0.286.0>, >>>>> {943280145,{744250,975,647546641},60017672}, >>>>> #Fun<couch_db_updater.10.57960608>, >>>>> #Fun<couch_db_updater.11.57960608>, >>>>> #Fun<couch_btree.5.133731799>, >>>>> #Fun<couch_db_updater.12.57960608>,snappy}, >>>>> {btree,<0.286.0>, >>>>> {943282327,745225,42485979}, >>>>> #Fun<couch_db_updater.13.57960608>, >>>>> #Fun<couch_db_updater.14.57960608>, >>>>> #Fun<couch_btree.5.133731799>, >>>>> #Fun<couch_db_updater.15.57960608>,snappy}, >>>>> {btree,<0.286.0>, >>>>> {943267963,[],5753}, >>>>> #Fun<couch_btree.3.133731799>, >>>>> #Fun<couch_btree.4.133731799>, >>>>> #Fun<couch_btree.5.133731799>,nil,snappy}, >>>>> 992456,<<"cbstats">>, >>>>> "/Volumes/terror/db/couchdb/cbstats.couch",[],[], >>>>> nil, >>>>> {user_ctx,null,[],undefined}, >>>>> nil,1000, >>>>> [before_header,after_header,on_file_open], >>>>> [{user_ctx, >>>>> {user_ctx,null,[<<"_admin">>],undefined}}], >>>>> snappy,nil,nil} >>>>> ** Reason for termination == >>>>> ** {timeout, >>>>> {gen_server,call, >>>>> [<0.288.0>, >>>>> {db_updated, >>>>> {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,<0.290.0>, >>>>> <0.286.0>,<0.367.0>, >>>>> {db_header,6,992456,0, >>>>> {943280145,{744250,975,647546641},60017672}, >>>>> {943282327,745225,42485979}, >>>>> {943267963,[],5753}, >>>>> 0,nil,nil,1000}, >>>>> 992456, >>>>> {btree,<0.286.0>, >>>>> {943280145,{744250,975,647546641},60017672}, >>>>> #Fun<couch_db_updater.10.57960608>, >>>>> #Fun<couch_db_updater.11.57960608>, >>>>> #Fun<couch_btree.5.133731799>, >>>>> #Fun<couch_db_updater.12.57960608>,snappy}, >>>>> {btree,<0.286.0>, >>>>> 
{943282327,745225,42485979}, >>>>> #Fun<couch_db_updater.13.57960608>, >>>>> #Fun<couch_db_updater.14.57960608>, >>>>> #Fun<couch_btree.5.133731799>, >>>>> #Fun<couch_db_updater.15.57960608>,snappy}, >>>>> {btree,<0.286.0>, >>>>> {943284347,[],5756}, >>>>> #Fun<couch_btree.3.133731799>, >>>>> #Fun<couch_btree.4.133731799>, >>>>> #Fun<couch_btree.5.133731799>,nil,snappy}, >>>>> 992456,<<"cbstats">>, >>>>> "/Volumes/terror/db/couchdb/cbstats.couch",[],[],nil, >>>>> {user_ctx,null,[],undefined}, >>>>> #Ref<0.0.15.160107>,1000, >>>>> [before_header,after_header,on_file_open], >>>>> [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}], >>>>> snappy,nil,nil}}]}} >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> dustin sallings >>>> >>>> -- >>>> dustin sallings >
