Hello again, I was just wondering if there was an update on this thread?
Since it is just one machine having an issue, do you (Rob and Rich) think a re-initialization from the master on the affected host would clear the clog?

I have left it alone since Mark was brought into the discussion.

Thank you!
John DeSantis

2014-10-23 9:34 GMT-04:00 Rich Megginson <[email protected]>:
> On 10/23/2014 07:01 AM, John Desantis wrote:
>>
>> Rob and Rich,
>>
>>>> ipa-replica-manage del should have cleaned things up. You can clear out
>>>> old RUVs with ipa-replica-manage too via list-ruv and clean-ruv. You use
>>>> list-ruv to get the id# to clean and clean-ruv to do the actual cleaning.
>>>
>>> I remember having previously tried this task, but it had failed on
>>> older RUV's which were not even active (the KDC was under some strain
>>> so ipa queries were timing out). However, I ran it again and have
>>> been able to delete all but 1 (it's still running) RUV referencing the
>>> previous replica.
>>>
>>> I'll report back once the task finishes or fails.
>>
>> The last RUV is "stuck" on another replica. It fails with the following error:
>>
>> [23/Oct/2014:08:55:09 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Initiating CleanAllRUV Task...
>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Retrieving maxcsn...
>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Found maxcsn (5447f861000000180000)
>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Cleaning rid (24)...
>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Waiting to process all the updates from the deleted replica...
>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to be online...
>> [23/Oct/2014:08:55:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to receive all the deleted replica updates...
>> [23/Oct/2014:08:55:11 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Replica maxcsn (5447f56b000200180000) is not caught up with deleted replica's maxcsn(5447f861000000180000)
>> [23/Oct/2014:08:55:11 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Replica not caught up (agmt="cn=meToiparepbackup.our.personal.domain" (iparepbackup:389))
>> [23/Oct/2014:08:55:11 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 10 seconds
>> [23/Oct/2014:08:55:23 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Replica maxcsn (5447f56b000200180000) is not caught up with deleted replica's maxcsn(5447f861000000180000)
>> [23/Oct/2014:08:55:23 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Replica not caught up (agmt="cn=meToiparepbackup.our.personal.domain" (iparepbackup:389))
>> [23/Oct/2014:08:55:23 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 20 seconds
>>
>> I then aborted the task, since the retry interval climbed to 14400 seconds.
>
>
> Mark, do you know what is going on here?
>
>
>> Would this be a simple re-initialization from the master on the host
>> "iparepbackup"?
>>
>> Thanks,
>> John DeSantis
>>
>> 2014-10-22 16:03 GMT-04:00 John Desantis <[email protected]>:
>>>
>>> Rob and Rich,
>>>
>>>> ipa-replica-manage del should have cleaned things up. You can clear out
>>>> old RUVs with ipa-replica-manage too via list-ruv and clean-ruv. You use
>>>> list-ruv to get the id# to clean and clean-ruv to do the actual cleaning.
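For reference, a rough sketch of the ipa-replica-manage steps being discussed, plus the re-initialization asked about above. The hostnames are just the sanitized ones from this thread, the rid comes from the CleanAllRUV log, and the abort-clean-ruv subcommand may not be present in every ipa version, so treat this as a sketch rather than a recipe:

  # ipa-replica-manage list-ruv
  # ipa-replica-manage clean-ruv 24
  # ipa-replica-manage abort-clean-ruv 24
  # ipa-replica-manage re-initialize --from=master.our.personal.domain

list-ruv and clean-ruv are run on a master (list-ruv gives the id numbers, clean-ruv removes one); abort-clean-ruv cancels a wedged clean task; the re-initialize is run on the affected host itself (e.g. iparepbackup) and pulls a fresh copy of the database from the named master.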
>>>
>>> I remember having previously tried this task, but it had failed on
>>> older RUV's which were not even active (the KDC was under some strain
>>> so ipa queries were timing out). However, I ran it again and have
>>> been able to delete all but 1 (it's still running) RUV referencing the
>>> previous replica.
>>>
>>> I'll report back once the task finishes or fails.
>>>
>>> Thanks,
>>> John DeSantis
>>>
>>> 2014-10-22 15:49 GMT-04:00 Rob Crittenden <[email protected]>:
>>>>
>>>> Rich Megginson wrote:
>>>>>
>>>>> On 10/22/2014 12:55 PM, John Desantis wrote:
>>>>>>
>>>>>> Richard,
>>>>>>
>>>>>>> You should remove the unused ruv elements. I'm not sure why they were not
>>>>>>> cleaned. You may have to use cleanallruv manually.
>>>>>>>
>>>>>>> https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Replication-Solving_Common_Replication_Conflicts.html#cleanruv
>>>>>>>
>>>>>>> note - use the cleanallruv procedure, not cleanruv.
>>>>>>
>>>>>> I'll try that, thanks for the guidance.
>>>>>>
>>>>>>> What is the real problem you have? Did replication stop working? Are you
>>>>>>> getting error messages?
>>>>>>
>>>>>> I cannot get the host to be a replica. Each time I run
>>>>>> `ipa-replica-install replica-info-host-in-question.our.personal.domain.gpg'
>>>>>> it fails. I had assumed it was due to the fact that the host was already a
>>>>>> replica, but had to be taken offline due to a hard disk failing. The
>>>>>> machine was re-provisioned after the new hard drive was installed.
>>>>>
>>>>> Ok. I don't know if we have a documented procedure for that case. I
>>>>> assumed that if you first ran ipa-replica-manage del, then
>>>>> ipa-replica-prepare, then ipa-replica-install, that would take care of that.
>>>>
>>>> ipa-replica-manage del should have cleaned things up. You can clear out
>>>> old RUVs with ipa-replica-manage too via list-ruv and clean-ruv. You use
>>>> list-ruv to get the id# to clean and clean-ruv to do the actual cleaning.
>>>>
>>>>>> When I enabled extra debugging during the installation process, the
>>>>>> initial error was that the dirsrv instance couldn't be started. I
>>>>>> checked into this and found that there were missing files in the
>>>>>> /etc/dirsrv/slapd-BLAH directory. I was then able to start dirsrv
>>>>>> after copying some schema files from another replica. The install did
>>>>>> move forward but then failed with Apache and its IPA configuration.
>>>>>>
>>>>>> I performed several uninstalls and re-installs, and at one point I got
>>>>>> error code 3 from ipa-replica-install, which is why I was thinking
>>>>>> that the old RUV's and tombstones were to blame.
>>>>>
>>>>> It could be. I'm really not sure what the problem is at this point.
>>>>
>>>> I think we'd need to see ipareplica-install.log to know for sure. It
>>>> could be the sort of thing where it fails early but doesn't kill the
>>>> install, so the last error is a red herring.
>>>>
>>>> rob
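For what it's worth, a rough sketch of the retry cycle on the failed replica. The log path is the stock /var/log location rather than something quoted in this thread, so adjust if the packages put it elsewhere:

  # ipa-server-install --uninstall
  # ipa-replica-install replica-info-host-in-question.our.personal.domain.gpg
  # less /var/log/ipareplica-install.log

The uninstall is run on host-in-question to tear down the partial install before retrying, and the last command is the log Rob is asking about; the dirsrv error logs under /var/log/dirsrv/ are often worth a look as well.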
>>>>>>
>>>>>> Thanks,
>>>>>> John DeSantis
>>>>>>
>>>>>> 2014-10-22 12:51 GMT-04:00 Rich Megginson <[email protected]>:
>>>>>>>
>>>>>>> On 10/22/2014 10:31 AM, John Desantis wrote:
>>>>>>>>
>>>>>>>> Richard,
>>>>>>>>
>>>>>>>> You helped me before in #freeipa, so I appreciate the assistance again.
>>>>>>>>
>>>>>>>>> What version of 389 are you using?
>>>>>>>>> rpm -q 389-ds-base
>>>>>>>>
>>>>>>>> 389-ds-base-1.2.11.15-34.el6_5
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> John DeSantis
>>>>>>>>
>>>>>>>> 2014-10-22 12:09 GMT-04:00 Rich Megginson <[email protected]>:
>>>>>>>>>
>>>>>>>>> On 10/22/2014 09:42 AM, John Desantis wrote:
>>>>>>>>>>
>>>>>>>>>> Hello all,
>>>>>>>>>>
>>>>>>>>>> First and foremost, a big "thank you!" to the FreeIPA developers for a
>>>>>>>>>> great product!
>>>>>>>>>>
>>>>>>>>>> Now, to the point!
>>>>>>>>>>
>>>>>>>>>> We're trying to re-provision a previous replica using the standard
>>>>>>>>>> documentation via the Red Hat site:
>>>>>>>>>>
>>>>>>>>>> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Identity_Management_Guide/Setting_up_IPA_Replicas.html
>>>>>>>>>>
>>>>>>>>>> However, we're running into errors during the import process. The
>>>>>>>>>> errors are varied and fail at random steps; there was an issue with
>>>>>>>>>> NTP or HTTP or LDAP, etc. This did not happen when we promoted a
>>>>>>>>>> separate node to become a replica.
>>>>>>>>>>
>>>>>>>>>> We had previously removed the replica via `ipa-replica-manage del` and
>>>>>>>>>> ensured that no trace of it being a replica existed: removed DNS
>>>>>>>>>> records and verified that the host enrollment was not present. I did
>>>>>>>>>> not use the "--force" and "--cleanup" options.
>>>>>>>>>
>>>>>>>>> What version of 389 are you using?
>>>>>>>>> rpm -q 389-ds-base
>>>>>>>
>>>>>>> You should remove the unused ruv elements. I'm not sure why they were not
>>>>>>> cleaned. You may have to use cleanallruv manually.
>>>>>>>
>>>>>>> https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Replication-Solving_Common_Replication_Conflicts.html#cleanruv
>>>>>>>
>>>>>>> note - use the cleanallruv procedure, not cleanruv.
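Should clean-ruv keep timing out, the manual cleanallruv route from the linked guide boils down to adding a task entry in cn=config. A sketch with the rid (24) and suffix from this thread; the attribute names are the ones I would expect from the 389-ds documentation, so verify them against the linked procedure before running it:

  # ldapmodify -x -D "cn=directory manager" -W <<EOF
  dn: cn=clean 24, cn=cleanallruv, cn=tasks, cn=config
  changetype: add
  objectclass: extensibleObject
  replica-base-dn: dc=our,dc=personal,dc=domain
  replica-id: 24
  cn: clean 24
  EOF

A wedged run can be cancelled the same way by adding a matching entry under cn=abortcleanallruv, cn=tasks, cn=config.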
>>>>>>>>>>
>>>>>>>>>> When I check RUV's against the host in question, there are several. I
>>>>>>>>>> also queried the tombstones against the host and found two entries
>>>>>>>>>> which have valid hex time stamps; coincidentally, out of the 9
>>>>>>>>>> tombstone entries, 2 have "nsds50ruv" time stamps. I'll paste
>>>>>>>>>> sanitized output below:
>>>>>>>>>>
>>>>>>>>>> # ldapsearch -x -W -LLL -D "cn=directory manager" -b
>>>>>>>>>> "dc=our,dc=personal,dc=domain" '(objectclass=nsTombstone)'
>>>>>>>>>> Enter LDAP Password:
>>>>>>>>>> dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=our,dc=personal,dc=domain
>>>>>>>>>> objectClass: top
>>>>>>>>>> objectClass: nsTombstone
>>>>>>>>>> objectClass: extensibleobject
>>>>>>>>>> nsds50ruv: {replicageneration} 50ef13ae000000040000
>>>>>>>>>> nsds50ruv: {replica 4 ldap://master.our.personal.domain:389} 5164d147000000040000 5447bda8000100040000
>>>>>>>>>> nsds50ruv: {replica 22 ldap://separatenode.our.personal.domain:389} 54107f9f000000160000 54436b25000000160000
>>>>>>>>>> nsds50ruv: {replica 21 ldap://iparepbackup.our.personal.domain:389} 51b734de000000150000 51b734ef000200150000
>>>>>>>>>> nsds50ruv: {replica 19 ldap://host-in-question.our.personal.domain:389} 510d56c9000100130000 510d82be000200130000
>>>>>>>>>> nsds50ruv: {replica 18 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>> nsds50ruv: {replica 17 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>> nsds50ruv: {replica 16 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>> nsds50ruv: {replica 15 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>> nsds50ruv: {replica 14 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>> nsds50ruv: {replica 13 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>> nsds50ruv: {replica 12 ldap://host-in-question.our.personal.domain:389}
>>>>>>>>>> nsds50ruv: {replica 23 ldap://host-in-question.our.personal.domain:389} 54187702000200170000 5418789a000000170000
>>>>>>>>>> dc: our
>>>>>>>>>> nsruvReplicaLastModified: {replica 4 ldap://master.our.personal.domain:389} 5447bce8
>>>>>>>>>> nsruvReplicaLastModified: {replica 22 ldap://separatenode.our.personal.domain:389} 54436a5e
>>>>>>>>>> nsruvReplicaLastModified: {replica 21 ldap://iparepbackup.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 19 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 18 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 17 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 16 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 15 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 14 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 13 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 12 ldap://host-in-question.our.personal.domain:389} 00000000
>>>>>>>>>> nsruvReplicaLastModified: {replica 23 ldap://host-in-question.our.personal.domain:389} 00000000
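Once a clean finishes, the cleaned rid should drop out of the entry above. A quick re-check limited to just the RUV tombstone (same bind as the search above) could look like:

  # ldapsearch -x -W -LLL -D "cn=directory manager" -b "dc=our,dc=personal,dc=domain" \
      '(&(objectclass=nsTombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))' nsds50ruv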
>>>>>>>>>>
>>>>>>>>>> dn: nsuniqueid=c08a2803-5b5a11e2-a527ce8b-8fa47d35,cn=host-in-question.our.personal.domain,cn=masters,cn=ipa,cn=etc,dc=our,dc=personal,dc=domain
>>>>>>>>>> objectClass: top
>>>>>>>>>> objectClass: nsContainer
>>>>>>>>>> objectClass: nsTombstone
>>>>>>>>>> cn: host-in-question.our.personal.domain
>>>>>>>>>> nsParentUniqueId: e6fa9418-5b5711e2-a1a9825b-daf5b5b0
>>>>>>>>>>
>>>>>>>>>> dn: nsuniqueid=664c4383-6d6311e2-8db6e946-de27dd8d,cn=host-in-question.our.personal.domain,cn=masters,cn=ipa,cn=etc,dc=our,dc=personal,dc=domain
>>>>>>>>>> objectClass: top
>>>>>>>>>> objectClass: nsContainer
>>>>>>>>>> objectClass: nsTombstone
>>>>>>>>>> cn: host-in-question.our.personal.domain
>>>>>>>>>> nsParentUniqueId: e6fa9418-5b5711e2-a1a9825b-daf5b5b0
>>>>>>>>>>
>>>>>>>>>> As you can see, "host-in-question" has many RUV's, two of which
>>>>>>>>>> appear to be "active", and there are two tombstone entries which I
>>>>>>>>>> believe (pardon my ignorance) possibly correlate with those "active"
>>>>>>>>>> entries of "host-in-question".
>>>>>>>>>>
>>>>>>>>>> Do these two tombstone entries need to be deleted with ldapdelete
>>>>>>>>>> before we can re-provision "host-in-question" and add it back as a
>>>>>>>>>> replica?
>>>>>>>
>>>>>>> No, you cannot delete tombstones manually. They will be cleaned up at some
>>>>>>> point by the dirsrv tombstone reap thread, and they should not be
>>>>>>> interfering with anything.
>>>>>>>
>>>>>>> What is the real problem you have? Did replication stop working? Are you
>>>>>>> getting error messages?
>>>>>>>
>>>>>>>>>> Thank you,
>>>>>>>>>> John DeSantis

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go To http://freeipa.org for more info on the project
