On 6/2/22 1:38 PM, Rob Crittenden wrote:
Kathy Zhu via FreeIPA-users wrote:
Hi Team,

We upgraded our CentOS 7 IPA masters to the latest:

CentOS Linux release 7.9.2009 (Core)

ipa-server.x86_64                      4.6.8-5.el7.centos.10
389-ds-base.x86_64                     1.3.10.2-15.el7_9
389-ds-base-libs.x86_64                1.3.10.2-15.el7_9
389-ds-base-snmp.x86_64                1.3.10.2-15.el7_9
slapi-nis.x86_64                       0.56.5-3.el7_9


After that, 8 of 10 masters had replication issues. After
reinitializing, 2 of them still have issues. They can accept
replication from other masters, but their own changes cannot be
replicated to the others.


Here are the logs in /var/log/dirsrv/slapd-EXAMPLE-COM/errors:


[01/Jun/2022:21:53:02.324756398 -0700] - ERR - NSMMReplicationPlugin -
send_updates - agmt="cn=dc1-ipa1.example.com-to-dc2-ipa1.example.com" (dc2-ipa1:389):
Data required to update replica has been purged from the changelog. If
the error persists the replica must be reinitialized.

[01/Jun/2022:21:53:03.396330801 -0700] - ERR -
agmt="cn=dc1-ipa1.example.com-to-dc3-ipa1.example.com" (dc3-ipa1:389) -
clcache_load_buffer - Can't locate CSN 627e26a50005001d0000 in the
changelog (DB rc=-30988). If replication stops, the consumer may need to
be reinitialized.

[01/Jun/2022:21:53:03.396502102 -0700] - ERR - NSMMReplicationPlugin -
changelog program - repl_plugin_name_cl -
agmt="cn=dc1-ipa1.example.com-to-dc3-ipa1.example.com" (dc3-ipa1:389):
CSN 627e26a50005001d0000 not found, we aren't as up to date, or we purged

[01/Jun/2022:21:53:03.396694568 -0700] - ERR - NSMMReplicationPlugin -
send_updates - agmt="cn=dc1-ipa1.example.com-to-dc3-ipa1.example.com" (dc3-ipa1:389):
Data required to update replica has been purged from the changelog. If
the error persists the replica must be reinitialized.

[01/Jun/2022:21:53:04.411599251 -0700] - ERR -
agmt="cn=dc1-ipa1.example.com-to-ipa0.example.com" (ipa0:389) -
clcache_load_buffer - Can't locate CSN 627e26a50005001d0000 in the
changelog (DB rc=-30988). If replication stops, the consumer may need to
be reinitialized.

[01/Jun/2022:21:53:04.411753186 -0700] - ERR - NSMMReplicationPlugin -
changelog program - repl_plugin_name_cl -
agmt="cn=dc1-ipa1.example.com-to-ipa0.example.com" (ipa0:389): CSN
627e26a50005001d0000 not found, we aren't as up to date, or we purged

[01/Jun/2022:21:53:04.411893312 -0700] - ERR - NSMMReplicationPlugin -
send_updates - agmt="cn=dc1-ipa1.example.com-to-ipa0.example.com" (ipa0:389): Data
required to update replica has been purged from the changelog. If the
error persists the replica must be reinitialized.

[01/Jun/2022:21:53:05.482898290 -0700] - ERR -
agmt="cn=dc1-ipa1.example.com-to-dc2-ipa1.example.com" (dc2-ipa1:389) -
clcache_load_buffer - Can't locate CSN 627e26a50005001d0000 in the
changelog (DB rc=-30988). If replication stops, the consumer may need to
be reinitialized.

[01/Jun/2022:21:53:05.483231727 -0700] - ERR - NSMMReplicationPlugin -
changelog program - repl_plugin_name_cl -
agmt="cn=dc1-ipa1.example.com-to-dc2-ipa1.example.com" (dc2-ipa1:389):
CSN 627e26a50005001d0000 not found, we aren't as up to date, or we purged

[01/Jun/2022:21:53:05.483483005 -0700] - ERR - NSMMReplicationPlugin -
send_updates - agmt="cn=dc1-ipa1.example.com-to-dc2-ipa1.example.com" (dc2-ipa1:389):
Data required to update replica has been purged from the changelog. If
the error persists the replica must be reinitialized.
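
For anyone reading these errors, it can help to decode the CSN that the
supplier cannot find. The sketch below assumes the usual 389-ds CSN layout
of 20 hex digits: 8 for the timestamp in seconds since the epoch, 4 for the
sequence number, 4 for the replica ID and 4 for the sub-sequence number.
The names parse_csn and CSN are made up for this example and are not part
of any 389-ds or IPA tool.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    """Decoded fields of a change sequence number (hypothetical helper type)."""
    timestamp: datetime  # time the change originated, in UTC
    seqnum: int          # sequence number within that second
    replica_id: int      # ID of the replica where the change originated
    subseqnum: int       # sub-sequence number


def parse_csn(csn: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four fields.

    Assumes the layout: 8 hex timestamp, 4 hex seqnum, 4 hex replica ID,
    4 hex sub-sequence number.
    """
    if len(csn) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(csn)}: {csn!r}")
    seconds = int(csn[0:8], 16)
    return CSN(
        timestamp=datetime.fromtimestamp(seconds, tz=timezone.utc),
        seqnum=int(csn[8:12], 16),
        replica_id=int(csn[12:16], 16),
        subseqnum=int(csn[16:20], 16),
    )


if __name__ == "__main__":
    # The CSN reported as missing in the errors log above.
    print(parse_csn("627e26a50005001d0000"))
```

Under that assumption, the missing CSN comes from replica ID 29 and dates
from mid-May 2022, a couple of weeks before the errors were logged.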


Note that these messages were logged after the reinitialization.


Any idea what's wrong here?

I'm not sure; cc'ing one of the 389ds developers.

Mark, any ideas?

Looks like https://github.com/389ds/389-ds-base/issues/5098

This was fixed fairly recently, and it has not been added to any RHEL 7.9 builds yet.

Pierre worked on this, but I think the issue happens when the LDIF used to init replication has the RUV entry as the first entry in the LDIF file.  If it is moved to the end of the file, I think that will fix the init issue.

So don't do an online init.  Instead, export the database with replication data (-r) from a good supplier, edit the LDIF file so that the RUV tombstone entry is the last entry in the file, then import it on the consumer.
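
The "edit the ldif file" step can be scripted instead of done by hand. Below is a minimal sketch, not an official tool. It assumes the RUV tombstone is the entry whose nsUniqueId is ffffffff-ffffffff-ffffffff-ffffffff, that entries are separated by blank lines, and that the file uses Unix line endings and fits in memory. The function names are made up for this example.

```python
import re
import sys

# Assumption: the RUV tombstone entry carries this well-known unique ID.
RUV_UNIQUEID = "ffffffff-ffffffff-ffffffff-ffffffff"


def is_ruv_record(record: str) -> bool:
    """Return True if this LDIF record looks like the RUV tombstone entry."""
    for line in record.splitlines():
        lowered = line.lower()
        # Match either the DN or the nsUniqueId attribute line.
        if lowered.startswith("dn: nsuniqueid=" + RUV_UNIQUEID):
            return True
        if lowered.replace(" ", "") == "nsuniqueid:" + RUV_UNIQUEID:
            return True
    return False


def move_ruv_last(ldif_text: str) -> str:
    """Return the LDIF text with any RUV tombstone record moved to the end.

    All other records (including a leading "version: 1" line) keep their
    original relative order.
    """
    records = [r for r in re.split(r"\n{2,}", ldif_text.strip("\n")) if r.strip()]
    ruv = [r for r in records if is_ruv_record(r)]
    rest = [r for r in records if not is_ruv_record(r)]
    return "\n\n".join(rest + ruv) + "\n"


if __name__ == "__main__":
    # Hypothetical usage: python move_ruv_last.py exported.ldif fixed.ldif
    if len(sys.argv) == 3:
        with open(sys.argv[1], encoding="utf-8") as src:
            fixed = move_ruv_last(src.read())
        with open(sys.argv[2], "w", encoding="utf-8") as dst:
            dst.write(fixed)
    else:
        print("usage: move_ruv_last.py INPUT.ldif OUTPUT.ldif")
```

The script writes to a new file rather than editing in place, so the original export stays available if the result needs to be compared or redone.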

HTH,
Mark


rob

--
Directory Server Development Team
