[ceph-users] Re: radosgw-admin: period update error after deleting and recreating zones

Michel Jouvin Wed, 21 May 2025 06:17:53 -0700

Hi,

An update on this issue. Thanks to suggestions from Frédéric Nass, I think I managed to clear the problem by deleting the realm and all its objects (zonegroup, zone, period) with radosgw-admin and deleting the pools associated with the deleted zone. I am sure it is not a general solution for this problem, which I was able to reproduce on a test cluster. I have the feeling that radosgw-admin should do a better job of avoiding such a mess when deleting zones, but that is another story. The reasons why deleting the realm and its objects worked for us include:

- The realm/zonegroup/zone had just been created and there was no useful content in it, so losing everything related to it was an option, as said previously (but deleting .rgw.root was not an option as we have several realms in production).

- We configure each realm/zonegroup/zone with a separate set of RGWs (which can be deployed on the same server by cephadm, but that is another story), so the only RGWs impacted are those related to the deleted realm.

- Our realm was single-site. After deleting the realm, it is not possible to push (commit) the change to other zonegroups/zones of the realm, as the realm must exist to be able to commit a new period. I guess that in a multisite configuration, it means that the cleanup operation must be done in all the clusters involved in the multisite configuration.
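
The cleanup described above could be sketched as the dry run below. It only prints candidate commands and executes nothing. The order of the steps, the choice to pass the realm and zonegroup on every command, and all four arguments are assumptions for illustration, not the exact sequence that was run; check each subcommand against `radosgw-admin help` for your release before using it.

```shell
#!/bin/sh
# Dry run: print, but do not execute, a possible cleanup sequence for one
# unused, single-site realm. All four arguments are hypothetical placeholders.
print_realm_cleanup_plan() {
    realm=$1
    zonegroup=$2
    zone=$3
    period_id=$4

    # Assumed order: zone, zonegroup, period, then the realm itself.
    echo "radosgw-admin zone delete --rgw-realm=$realm --rgw-zonegroup=$zonegroup --rgw-zone=$zone"
    echo "radosgw-admin zonegroup delete --rgw-realm=$realm --rgw-zonegroup=$zonegroup"
    echo "radosgw-admin period delete --period=$period_id"
    echo "radosgw-admin realm rm --rgw-realm=$realm"
    # Pool names are site-specific, so only list them here; removal is manual.
    echo "ceph osd pool ls"
}
```

Calling `print_realm_cleanup_plan myrealm myzonegroup myzone <period-id>` prints five lines to review. The pools that belonged to the deleted zone would then have to be identified by hand in the `ceph osd pool ls` output and removed with `ceph osd pool rm`, taking care not to touch `.rgw.root` or pools used by other realms.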


Best regards,

Michel

On 14/05/2025 at 18:12, Michel Jouvin wrote:

Hi,

We are still stuck with this problem and I have not seen an answer to my previous emails. We found the explanation of the problem in the doc: https://docs.ceph.com/en/latest/radosgw/multisite/#deleting-a-zone. But the doc does not mention the way out of the problem... If we delete the realm, would it help? There is no content in this realm/zonegroup/zone, so removing everything is an option if it helps.


Thanks in advance for any hint. Best regards,

Michel
Sent from my mobile

On 7 May 2025 at 16:49:19, Michel Jouvin <[email protected]> wrote:

Hi,

I managed to find what the zone and zonegroup IDs were before they were
deleted, and I confirm that those referred to in the error messages are
the IDs of the deleted zone and zonegroup. The new zone and zonegroup
(which have the same names; again, I am not sure whether that is a problem,
as everything should be done by ID, shouldn't it?) have been defined as
master zone and zonegroup, so the other ones should just be deleted,
shouldn't they? I really don't understand what the error means and what
can be done to fix it.

Best regards,

Michel

On 06/05/2025 at 21:29, Michel Jouvin wrote:

Hi,

It is not the first time that, after making configuration changes in
RADOS for a realm/zonegroup/zone with radosgw-admin, we get errors
when trying to do a "period update --commit". We never found good
documentation on how to fix these problems. Up to now we have always
managed at some point to restore a good configuration that can be
committed, but it is probably time for us to have a more informed approach!


The latest occurrence of the problem happened today with a
realm/zonegroup/zone created recently. While trying to fix a problem with
the non-working haproxy associated with it, one of my colleagues
decided to delete and recreate the zone and zonegroup (with the same
names). The related commands worked, but since then any attempt to do
"period update --commit" results in the following error:

-------

2025-05-06T11:56:20.939+0200 7fdc7d41da80 0 failed reading obj info
from .rgw.root:zone_info.93af6e0c-4552-4c2e-b167-36114a5a81e4: (2) No
such file or directory
2025-05-06T11:56:20.945+0200 7fdc7d41da80 0 failed reading obj info
from .rgw.root:zonegroup_info.d7221099-4e7d-43cb-a1e8-28a750de1cd5:
(2) No such file or directory
2025-05-06T11:56:21.160+0200 7fdc7d41da80 0 failed reading obj info
from .rgw.root:zone_info.93af6e0c-4552-4c2e-b167-36114a5a81e4: (2) No
such file or directory
2025-05-06T11:56:21.160+0200 7fdc7d41da80 -1 Cannot find zone
id=93af6e0c-4552-4c2e-b167-36114a5a81e4 (name=default)
2025-05-06T11:56:21.160+0200 7fdc7d41da80 0 ERROR: failed to start
notify service ((22) Invalid argument
2025-05-06T11:56:21.160+0200 7fdc7d41da80 0 ERROR: failed to init
services (ret=(22) Invalid argument)
couldn't init storage provider
-------
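
To see at a glance which `.rgw.root` objects such a log complains about, the object names can be pulled out of the messages. The helper below is a minimal sketch, assuming the log is piped on stdin and that the IDs are 36-character UUIDs as in the output above; the function name is hypothetical.

```shell
#!/bin/sh
# Print the distinct .rgw.root object names that a radosgw-admin log
# (read from stdin) reports as unreadable.
missing_rgw_objects() {
    # Match "zone_info.<uuid>" and "zonegroup_info.<uuid>" object names.
    grep -oE '\.rgw\.root:zone(group)?_info\.[0-9a-f-]{36}' |
        sort -u
}
```

For example, `radosgw-admin period update --commit 2>&1 | missing_rgw_objects` would list each missing object once. Each ID can then be compared with what is actually stored, for instance by searching the output of `rados -p .rgw.root ls` and of `radosgw-admin zone list` and `radosgw-admin zonegroup list`.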

I have the feeling that it is related to the deleted objects that are
no longer found, but it is not completely clear what the way out of
it is. Is the problem related to recreating the zone/zonegroup with the
same names? There are several realms already in production, so we
cannot do a .rgw.root reset, but this particular realm has never been
put in production, so we can delete everything related to it.

Thanks in advance for any hint or pointer. Best regards,

Michel

_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
