Hello everyone,
We are running a small cluster with 3 nodes and 25 OSDs per node, on Ceph
version 17.2.6.
Recently the active MDS crashed, and since then every newly started MDS has
been stuck in the up:replay state. The output of 'ceph tell mds.cephfs:0
status' shows that the journal is read in completely. As soon as the read
finishes, the MDS crashes and the next one starts reading the journal.
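In case it helps anyone watch the same behaviour, here is a minimal Python
sketch of how the state can be read from that command. It assumes the status
output is JSON with a top-level "state" field and that the ceph CLI and a
keyring are available on the host; the function names mds_state and poll_rank
are illustrative only, not part of any Ceph tooling.

```python
import json
import subprocess


def mds_state(status_json: str) -> str:
    """Return the 'state' field from the JSON text printed by
    'ceph tell mds.<rank> status' (assumed field name), or 'unknown'."""
    return json.loads(status_json).get("state", "unknown")


def poll_rank(rank: str = "cephfs:0") -> str:
    """Run the status command quoted above once and return the MDS state.

    Raises CalledProcessError if the command fails, for example while
    the MDS is restarting after a crash.
    """
    result = subprocess.run(
        ["ceph", "tell", f"mds.{rank}", "status"],
        check=True,
        capture_output=True,
        text=True,
    )
    return mds_state(result.stdout)
```

Calling poll_rank() in a loop with a short sleep would show how long each MDS
stays in up:replay before it goes away.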
At the moment I have a journal inspection running ('cephfs-journal-tool
--rank=cephfs:0 journal inspect').
Does anyone have any further suggestions on how I can get the cluster
running again as quickly as possible?
Best regards
Lars
Lars Köppel
Developer
Email: [email protected]
Phone: +49 6221 5993580
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai
_______________________________________________
ceph-users mailing list -- [email protected]