My main question now is "which is the 'latest' MON?"

Check the timestamps of the files within the mon DB store. ;-) No need to dig through the DB itself.
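For example, something like this reports the newest file in each mon's store.db (a sketch; /var/lib/ceph/mon is the usual default path and an assumption here):

```shell
# Report the newest file mtime inside each mon's store.db directory.
newest_mon_store() {
  local root=${1:-/var/lib/ceph/mon} d newest
  for d in "$root"/ceph-*/store.db; do
    [ -d "$d" ] || continue
    # newest file by modification time, printed as a sortable ISO timestamp
    newest=$(find "$d" -type f -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort | tail -n 1)
    echo "$d  last write: $newest"
  done
}
newest_mon_store
```

Whichever store was written to last is the best candidate for the surviving mon.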

If you don't feel confident managing the cluster with DeepSea (I know of people who were literally afraid of DeepSea stages :-D ), then don't. :-) Even without cephadm you can deploy daemons relatively easily. Two years ago I wrote an article [1] on how to migrate from SES 6 (Nautilus) to upstream Ceph (Pacific) after testing this procedure for a potential customer. Feel free to reach out to me if you get to that point.

I'm still in favor of the single-mon approach; it has worked many times for us (actually, it has always worked), and it's relatively quick and easy to test. If that doesn't work, there's still the procedure of collecting the maps from the OSDs to rebuild the mon store. But let's see how far you get before exploring that option.
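For reference, a rough sketch of both options, with assumed mon ids (keep mon2, drop mon1/mon3) and default paths. Stop the mon daemons and back up /var/lib/ceph/mon before trying either:

```shell
# Option 1: shrink the monmap so the surviving mon can form quorum alone.
shrink_monmap() {
  command -v monmaptool >/dev/null 2>&1 || { echo "ceph tools not found"; return 1; }
  ceph-mon -i mon2 --extract-monmap /tmp/monmap   # dump mon2's current monmap
  monmaptool --print /tmp/monmap                  # inspect before editing
  monmaptool /tmp/monmap --rm mon1 --rm mon3      # drop the other mons (assumed ids)
  ceph-mon -i mon2 --inject-monmap /tmp/monmap    # mon2 is now the only mon
}

# Option 2 (fallback): rebuild a mon store from the OSDs; run on each OSD host.
rebuild_mon_store() {
  command -v ceph-objectstore-tool >/dev/null 2>&1 || { echo "ceph tools not found"; return 1; }
  mkdir -p /tmp/mon-store
  local osd
  for osd in /var/lib/ceph/osd/ceph-*; do
    [ -d "$osd" ] || continue
    ceph-objectstore-tool --data-path "$osd" \
      --op update-mon-db --mon-store-path /tmp/mon-store
  done
}
```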

[1] https://heiterbiswolkig.blogs.nde.ag/2023/08/14/how-to-migrate-from-suse-enterprise-storage-to-upstream-ceph/

Zitat von Miles Goodhew <[email protected]>:

On Wed, 18 Jun 2025, at 18:09, Eugen Block wrote:
That does look strange indeed, either an upgrade went wrong or someone
already fiddled with the monmap, I'd say. But anyway, I wouldn't try
to deploy a 4th mon since it would want to sync the store, but we
don't know what state the store is actually in. And besides
that, 2 out of 4 MONs still isn't a quorum, so there's no real
benefit. So my best bet would be on the mon with the most recent
store. And if the cluster comes back up with one mon, you'll need to
wipe the traces of the previous mons so DeepSea can redeploy
additional mons cleanly. Or is the cluster not managed by DeepSea
anymore?

Replies to fragments from above are below:

either an upgrade went wrong or someone already fiddled with the monmap

That's entirely possible. I'm playing the role of a "guy who knows a bit about Ceph", trying to un-explode an old cluster on an unsupported OS and hardware. The original deployers are long since gone, and the day-to-day admins were never given much of a handover. There are legends of several phases of upgrades and deployment-system replacements, but concrete documentation is thin on the ground. Certainly, I recently found evidence of failed OS upgrades that broke part of the RGW services years ago.

I had previously documented a plan to migrate the cluster to new, supported hardware, OS and Ceph versions, but the client was still thinking about it when this happened.


wouldn't try to deploy a 4th mon
The idea of the 4th MON was just to see if I could deploy a new MON without breaking the cluster much more. However, given I can't get more than one MON to start, it's pretty broken right now. If that deployment worked, I intended to remove and redeploy each of the other two MONs before retiring the 4th MON again. A side benefit of this is that it lets me test some of my cluster-upgrade plan. One of the cluster's clients is OpenStack, which in my experience is pretty "sentimental" about its set of MON IPs.
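For the record, my sketch of that manual deployment (it assumes a working quorum, an id of "mon4" and default paths, so it's moot until at least one mon is healthy again):

```shell
# Manually add a mon to an otherwise healthy cluster.
add_mon() {
  command -v ceph-mon >/dev/null 2>&1 || { echo "ceph tools not found"; return 1; }
  ceph auth get mon. -o /tmp/mon-keyring   # fetch the mon keyring (needs quorum)
  ceph mon getmap -o /tmp/monmap           # fetch the current monmap (needs quorum)
  # initialise the new mon's store, then start it
  ceph-mon -i mon4 --mkfs --monmap /tmp/monmap --keyring /tmp/mon-keyring
  systemctl start ceph-mon@mon4
}
```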


mon with the most recent store
How would I find out which MON that is? I'm told mon3 was the last one operating (but it emits a wall of "e6 handle_auth_request failed to assign global_id" logs when running). mon2 is the one that survives if you try to start all of them. I've tried inspecting the (SQLite?) DBs, but can't get much comprehensible info out of them yet (I don't have any experience tinkering with SQLite, though I'm OK with an "actual" SQL REPL). I can't get quorum, so I can't run "ceph ..." command lines, but I can talk to each of the MONs on their Unix sockets when they're running.
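(For reference, this is roughly how I've been querying them; the socket path glob is an assumption based on the usual defaults:)

```shell
# Ask each local mon for its own status over the admin socket (no quorum needed).
mon_status_all() {
  command -v ceph >/dev/null 2>&1 || { echo "ceph CLI not found"; return 1; }
  local sock
  for sock in /var/run/ceph/ceph-mon.*.asok; do
    [ -S "$sock" ] || continue
    echo "== $sock"
    # mon_status reports this mon's rank, state and its embedded monmap
    ceph --admin-daemon "$sock" mon_status
  done
}
```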


Or is the cluster not managed by DeepSea anymore?
I don't think it is. None of the admins (nor I) have much experience with Salt (I'm more of an Ansible person). The aforementioned "legends" of the system's lifetime also say there were multiple different management systems over the years. I've mainly used the existing Salt config to break into managed nodes I didn't yet have an account on and to do fleet-wide "shell command" operations. Given the probability of historic broken OS upgrades and possibly abandoned Salt management, I'd be wary of trying to use it for deployment automation.


(Now in a later email)
Although I'm not a dev, I looked into the code [0] anyway.

The comments before the maybe_resize_cluster function say:

  * If a cluster is undersized (with respect to max_mds), then
  * attempt to find daemons to grow it. If the cluster is oversized
  * (with respect to max_mds) then shrink it by stopping its highest rank.

Is it possible that an operator/admin tried to resize the MDS cluster
(shrink or grow the number of MDS daemons)? Or was a DeepSea stage
executed in order to deploy additional daemons? Maybe some history
could help understand what might have happened.

Yes, I saw all that too. I was told this all started because one of the admins noticed that the CephFS service was slow and was reporting laggy MDSes. This may well be a latent issue from a possible historical failed upgrade (pure guesswork here). The admin tried restarting some daemons, and eventually only mon2 would run (I'm a bit vague on the details). I don't *think* they tried removing the MDS daemons, but it's possible (I'll check tomorrow).

One of my possible plans of attack was to see whether that "maybe resize..." method could be skipped with some "no"-flag or other config setting, then try to get quorum established before re-enabling it and hopefully coming back to health. That's probably too wishful a prospect, though.
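If I do get quorum back, my understanding (an assumption, not verified in the code) is that there's no direct off-switch for that method, but pinning the MDS count should at least remove the resize pressure, something like:

```shell
# Assumed fs name "cephfs"; both commands need a working mon quorum.
fs_pin_mds() {
  command -v ceph >/dev/null 2>&1 || { echo "ceph CLI not found"; return 1; }
  ceph fs set cephfs max_mds 1        # one active MDS: nothing to grow/shrink
  # optional, with side effects: also stops standbys replacing a failed rank
  ceph fs set cephfs joinable false
}
```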

Thanks again for all your feedback. Even if this just turns out to be a massive "rubber ducking" session, you've given me some new ideas and threads to pull. My main question now is "which is the 'latest' MON?"

M0les.
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]

