I found a misconfiguration in my ceph config dump:

mgr                                        advanced
 mgr/cephadm/migration_current          5

Changing it to 3 solved the issue, and the orchestrator is working
properly again.
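
For reference, a minimal sketch of how the value could be inspected and
changed, assuming the standard "ceph config get" and "ceph config set"
commands. The run helper and the DRY_RUN variable are hypothetical
additions for illustration only: by default the script just prints the
commands, so they can be reviewed before anything touches the cluster.
The value 3 is simply what worked in this case, not a general
recommendation.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
OPTION="mgr/cephadm/migration_current"

run() {
    # Hypothetical helper, not part of ceph: echo the command in dry-run mode.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

show_migration() { run ceph config get mgr "$OPTION"; }
set_migration()  { run ceph config set mgr "$OPTION" "$1"; }

show_migration      # inspect the current value (5 in the dump above)
set_migration 3     # the value that restored the orchestrator in this case
show_migration      # confirm the change
```

Running it with DRY_RUN=0 on a node with an admin keyring would issue the
same three commands against the cluster.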

This seems to be related to the earlier failed upgrade to Quincy, which
changed the value automatically.
If I understand correctly, migration_current is some kind of safety
mechanism used during upgrades.
If you have more information, please let me know.

Regards,
Reza


On Mon, 29 Aug 2022 at 10:50, Reza Bakhshayeshi <[email protected]>
wrote:

> Hi
>
> I'm using the Pacific release with cephadm. After a failed upgrade from
> 16.2.7 to 17.2.2, two of the three MGR nodes stopped working (a known
> upgrade bug), and the orchestrator did not respond when I tried to roll
> back the services, so I had to remove the daemons and add the correct
> ones manually by running this command:
>
> ceph orch daemon add mgr --placement=<my_host>
>
> As suggested in some bug reports, I also tried removing the admin label
> and reapplying it.
>
> Now the cluster status is healthy, but the orchestrator still doesn't
> work properly, and I cannot upgrade to a newer version. For example,
> nothing happens when I try to add an RGW node:
>
> ceph orch host add <name> <IP> rgw-swift
> ceph orch apply rgw swift --realm=<realm-name> --zone=<zone-name>
> --placement="label:rgw-swift" --port=<port-num>
>
> I can't see any errors in the logs. The orchestrator just seems to have
> stopped responding. I also tried these commands:
>
> ceph orch pause/cancel/resume
> ceph orch module enable/disable
>
> Do you have any idea what might be wrong?
>
> Best,
> Reza
>