Another instance:
Yes, TCP keepalive is enabled.

aaa-prod-aws-7:1636
# requesting: contextCSN
*contextCSN: 20250101065905.147164Z#000000#000#000000*

aaa-prod-aws-7:2636
# requesting: contextCSN
contextCSN: 20250102140005.217756Z#000000#000#000000
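
To put a number on how far apart the two contextCSN values above are, here is a minimal sketch. It assumes the usual CSN layout of timestamp#count#sid#mod with the last three fields in hex; the helper names parse_csn and csn_lag_seconds are made up for illustration and are not part of any OpenLDAP tool.

```python
from datetime import datetime, timezone

def parse_csn(csn):
    """Split a CSN (time#count#sid#mod) and parse its UTC timestamp."""
    stamp, count, sid, mod = csn.strip().split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # count, sid and mod are assumed to be hexadecimal fields
    return when, int(count, 16), int(sid, 16), int(mod, 16)

def csn_lag_seconds(older, newer):
    """Seconds between the timestamps of two CSNs."""
    return (parse_csn(newer)[0] - parse_csn(older)[0]).total_seconds()

# The two values reported above for ports 1636 and 2636
stuck = "20250101065905.147164Z#000000#000#000000"
current = "20250102140005.217756Z#000000#000#000000"

lag = csn_lag_seconds(stuck, current)
print(f"lag: {lag:.0f} s ({lag / 3600:.1f} h)")
```

The older value is the same CSN that appears in the last consumer cookie logged for rid=147.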


dn: cn=Consumer 147,cn=Database 1,cn=Databases,cn=Monitor
objectClass: olmSyncReplInstance
cn: Consumer 147

*Consumer logs:*

Jan  1 01:59:12 aaa-prod-aws-7 slapd[1161089]: do_syncrep2: rid=147
cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000
Jan  1 01:59:12 aaa-prod-aws-7 slapd[1161089]: syncrepl_entry: rid=147
LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY)
csn=20250101065905.147164Z#000000#000#000000 tid 0x7f8e4a5fd640
Jan  1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_queue_csn: queueing
0x7f8e40152360 20250101065905.147164Z#000000#000#000000
Jan  1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_graduate_commit_csn:
removing 0x7f8e40152360 20250101065905.147164Z#000000#000#000000
Jan  1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_queue_csn: queueing
0x7f8e4036c6c0 20250101065905.147164Z#000000#000#000000
Jan  1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_graduate_commit_csn:
removing 0x7f8e4036c6c0 20250101065905.147164Z#000000#000#000000

(Nothing after the above is logged regarding replication)

*Master:*

Jan  1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1
syncprov_sendresp:
cookie=rid=147,csn=20250101065905.124585Z#000000#000#000000
Jan  1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1
syncprov_sendresp:
cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000

(Nothing after the above for rid=147)
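
To pull the "conn=xxx op=yyy" tag for a given rid out of a provider log, something like the sketch below could work. It assumes each log entry is on a single line (the excerpts in this mail are wrapped by the mail client) and only matches syncprov_sendresp lines of the shape shown above; the function name sessions_for_rid is hypothetical.

```python
import re

# Matches provider lines such as:
#   ... conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=...
SENDRESP = re.compile(
    r"conn=(?P<conn>\d+) op=(?P<op>\d+)\s+syncprov_sendresp:.*?"
    r"cookie=rid=(?P<rid>\d+),csn=(?P<csn>\S+)"
)

def sessions_for_rid(lines, rid):
    """Return (conn, op, csn) for every syncprov_sendresp line of this rid."""
    hits = []
    for line in lines:
        m = SENDRESP.search(line)
        if m and int(m.group("rid")) == rid:
            hits.append((int(m.group("conn")), int(m.group("op")), m.group("csn")))
    return hits

# The two master log entries quoted above, unwrapped to one line each
sample = [
    "Jan  1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 "
    "syncprov_sendresp: cookie=rid=147,csn=20250101065905.124585Z#000000#000#000000",
    "Jan  1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 "
    "syncprov_sendresp: cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000",
]

for conn, op, csn in sessions_for_rid(sample, 147):
    print(f"conn={conn} op={op} csn={csn}")
```

Once the conn/op pair is known, every other provider line carrying the same tag belongs to that consumer's session.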

*After restarting Consumer:*

Jan  2 09:29:54 aaa-prod-aws-7 slapd[2750000]: slap_get_csn: conn=-1 op=0
generated new csn=20250102142954.929120Z#000000#000#000000 manage=0
Jan  2 09:29:55 aaa-prod-aws-7 slapd[2750000]: do_syncrep1: rid=147
starting refresh (sending
cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000)
Jan  2 09:29:59 aaa-prod-aws-7 slapd[2750000]: syncrepl_entry: rid=147
LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) csn=(none) tid 0x7f3288dfd640
Jan  2 09:29:59 aaa-prod-aws-7 slapd[2750000]: syncrepl_entry: rid=147
LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) csn=(none) tid 0x7f3288dfd640

Thanks,
Suresh

On Thu, Jan 2, 2025 at 6:34 AM Ondřej Kuzník <[email protected]> wrote:

> On Sun, Dec 29, 2024 at 01:28:42PM -0500, Suresh Veliveli wrote:
> > Another instance where replication is stuck and not recovering.
> >
> > # requesting: contextCSN
> > contextCSN: *20241229135907.725117Z#000000#000#000000*
> > aaa-prod-aws-10:2636
> > # requesting: contextCSN
> > contextCSN: *20241228185913.665451Z#000000#000#000000*
> >
> > *Log info:*
> > Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: do_syncrep2: rid=650
> > cookie=rid=650,csn=20241228185913.665451Z#000000#000#000000
> > Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: syncrepl_entry: rid=650
> > LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY)
> > csn=20241228185913.665451Z#000000#000#000000 tid 0x7f26ee5fd640
> > Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing
> > 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000
> > Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn:
> > removing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000
> > Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing
> > 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000
> > Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn:
> > removing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000
> >
> > Nothing gets logged about replication after the above.
> >
> > Am I missing something?
>
> Hi Suresh,
> anything in the provider logs around that time? All of a consumer's
> messages are tagged with a specific "conn=xxx op=yyy", which you can
> discover e.g. by looking for the cookie it sends at the beginning of the
> session.
>
> Couple of other questions:
> - is the TCP connection alive as far as the OS is concerned (I see in
>   the thread you've confirmed TCP keepalive is enabled, correct?)
> - could you post the cn=monitor info for the consumer? The objectclass
>   to look for is olmSyncReplInstance
>
> Thanks,
>
> --
> Ondřej Kuzník
> Senior Software Engineer
> Symas Corporation                       http://www.symas.com
> Packaged, certified, and supported LDAP solutions powered by OpenLDAP
>


-- 
Suresh Veliveli
Sr. UNIX Systems Engineer
Georgetown University
University Information Services | Security Infrastructure and
Policy-Identity and Collaboration
202-262-6676 (cell) | 202-687-3108 (work)
