On Sun, Dec 29, 2024 at 01:28:42PM -0500, Suresh Veliveli wrote:
> Another instance where replication is stuck and not recovering.
>
> # requesting: contextCSN
> contextCSN: *20241229135907.725117Z#000000#000#000000*
> aaa-prod-aws-10:2636
> # requesting: contextCSN
> contextCSN:* 20241228185913.665451Z#000000#000#000000*
>
> *Log info:*
> Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: do_syncrep2: rid=650
> cookie=rid=650,csn=20241228185913.665451Z#000000#000#000000
> Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: syncrepl_entry: rid=650
> LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY)
> csn=20241228185913.665451Z#000000#000#000000 tid 0x7f26ee5fd640
> Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing
> 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000
> Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn:
> removing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000
> Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing
> 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000
> Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn:
> removing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000
>
> Nothing gets logged about replication after the above.
>
> Am I missing something?

Hi Suresh,
is there anything in the provider logs around that time? All of the
consumer's messages will be tagged with a specific "conn=xxx op=yyy",
which you can discover e.g. by looking for the cookie it sends at the
beginning of the session.

A couple of other questions:
- is the TCP connection alive as far as the OS is concerned? (I see in
  the thread you've confirmed TCP keepalive is enabled, correct?)
- could you post the cn=monitor info for the consumer? The objectclass
  to look for is olmSyncReplInstance

Thanks,

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
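
As a rough illustration of the cookie lookup described above, here is a minimal Python sketch. It assumes the provider log is a plain text file where each relevant line carries the "conn=xxx op=yyy" tag, and that a substring such as "rid=650" is enough to spot the line with the consumer's cookie. The function names find_session and session_lines are made up for this example, and the exact wording of the provider's log messages is not assumed.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Matches the "conn=xxx op=yyy" tag that prefixes per-operation log lines.
CONN_OP = re.compile(r"\bconn=(\d+) op=(\d+)\b")


def find_session(lines: Iterable[str], cookie: str) -> Optional[Tuple[int, int]]:
    """Return (conn, op) from the first line holding both the tag and the cookie text.

    `cookie` is a plain substring chosen by the caller (assumption: something
    like "rid=650" identifies the consumer well enough).
    """
    for line in lines:
        if cookie in line:
            match = CONN_OP.search(line)
            if match:
                return int(match.group(1)), int(match.group(2))
    return None


def session_lines(lines: Iterable[str], conn: int, op: int) -> List[str]:
    """Return every line tagged with exactly this conn/op pair."""
    tag = re.compile(rf"\bconn={conn} op={op}\b")
    return [line for line in lines if tag.search(line)]
```

To use it, read the provider log into a list first (so it can be scanned twice), call find_session(log_lines, "rid=650"), and pass the resulting pair to session_lines to see everything the provider logged for that consumer session.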
