> Apparently those UUIDs aren't as reliable as I thought.
>
> I've had problems with a server box that hosts a ceph VM.
VM?
> Looks like the mobo disk controller is unreliable
Lemme guess, is it an IR / RoC / RAID type? As opposed to JBOD / IT?
If the former, and it’s an LSI SKU as most are, I’d love it if you could
privately send me the output of
storcli64 /c0 show termlog >/tmp/termlog.txt
Sometimes flakiness is actually with the drive backplane, especially when it
has an embedded expander. In either case, updating HBA firmware sometimes
makes a real difference.
And drive firmware.
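If you want to compare against what's currently flashed, something like this
should show both (the /c0 and /dev/sdX paths are just placeholders):

storcli64 /c0 show all | grep -iE 'fw|firmware'   # controller FW package / version
smartctl -i /dev/sdX                              # drive model and firmware revision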
> AND one of the disks passes SMART
I’m curious if it shows SATA downshifts.
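Something along these lines usually surfaces them (sdX being wherever the drive
happens to land):

smartctl -l sataphy /dev/sdX     # SATA Phy event counters, CRC errors etc.
dmesg | grep -i 'sata link'      # link resets / "limiting SATA link speed" messages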
> but has interface problems. So I moved the disks to an alternate box.
>
> Between relocation and dropping the one disk, neither of the 2 OSDs for that
> host will come up. If everything was running solely on static UUIDs, the good
> disk should have been findable even if its physical disk device name shifted.
> But it wasn't.
Did you try
ceph-volume lvm activate --all
?
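If that doesn't bring them up, it's worth checking whether ceph-volume can still
read the LVM tags at all, something along the lines of:

ceph-volume lvm list                            # enumerates OSDs from their LVM tags
ceph-volume lvm activate <osd-id> <osd-fsid>    # activate a single OSD by hand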
> Which brings up something I've wondered about for some time. Shouldn't it be
> possible for OSDs to be portable?
I haven’t tried it much, but that *should* be true, modulo CRUSH location.
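By default the OSD re-homes itself under whatever host bucket it starts on, so
roughly (osd.7, the weight, and the host name here are made-up values):

ceph config get osd osd_crush_update_on_start                        # true by default
ceph osd crush create-or-move osd.7 1.0 host=newhost root=default   # manual re-home if it isn't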
> That is, if a box goes bad, in theory I should be able to remove the drive
> and jack it into a hot-swap bay on another server and have that server able
> to import the relocated OSD.
I’ve effectively done a chassis swap, moving all the drives including the boot
volume, but that admittedly was in the ceph-disk days.
> True, the metadata for an OSD is currently located on its host, but it seems
> like it should be possible to carry a copy on the actual device.
My limited understanding is that this *is* the case with LVM.
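You can eyeball those tags with plain LVM tooling, e.g.:

lvs -o lv_name,vg_name,lv_tags

which should list ceph.osd_id, ceph.osd_fsid, ceph.cluster_fsid and friends
against the OSD LVs, and that's what ceph-volume reads back at activation time.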
>
> Tim
>
> On 4/11/25 16:23, Anthony D'Atri wrote:
>> Filestore, pre-ceph-volume, may have been entirely different. IIRC LVM is
>> used these days to exploit persistent metadata tags.
>>
>>> On Apr 11, 2025, at 4:03 PM, Tim Holloway <[email protected]> wrote:
>>>
>>> I just checked an OSD, and the "block" entry is indeed linked to storage
>>> through a /dev/mapper UUID LV, not a raw /dev device. When ceph builds an
>>> LV-based OSD, it creates a VG named "ceph-uuuu", where "uuuu" is a UUID, and
>>> an LV named "osd-block-vvvv", where "vvvv" is also a UUID. So although you'd
>>> map the OSD to something like /dev/vdb in a VM, the name ceph actually uses
>>> is UUID-based (and LVM-based) and thus not subject to change with
>>> alterations in the hardware, since the UUIDs are part of the metadata in the
>>> VGs and LVs that ceph creates.
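(That link typically looks something like the following, with the OSD id and
UUIDs as made-up placeholders:

ls -l /var/lib/ceph/osd/ceph-0/block
block -> /dev/ceph-<vg-uuid>/osd-block-<lv-uuid>
)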
>>>
>>> Since I got that from a VM, I can't vouch for all cases, but I thought it
>>> especially interesting that ceph was creating LVM counterparts even for
>>> devices that were not themselves LVM-based.
>>>
>>> And yeah, I understand that it's the amount of replicated OSD data that
>>> counts more than the number of hosts, but when an entire host goes down and
>>> there are few hosts, that can take a large bite out of the replicas.
>>>
>>> Tim
>>>
>>> On 4/11/25 10:36, Anthony D'Atri wrote:
>>>> I thought those links were to the by-uuid paths for that reason?
>>>>
>>>>> On Apr 11, 2025, at 6:39 AM, Janne Johansson <[email protected]> wrote:
>>>>>
>>>>> Den fre 11 apr. 2025 kl 09:59 skrev Anthony D'Atri
>>>>> <[email protected]>:
>>>>>> Filestore IIRC used partitions, with cute hex GPT types for various
>>>>>> states and roles. Udev activation was sometimes problematic, and LVM
>>>>>> tags are more flexible and reliable than the prior approach. There no
>>>>>> doubt is more to it but that’s what I recall.
>>>>> Filestore used to have symlinks to the journal device (if used) that
>>>>> pointed at sdX, where that X would of course jump around if you changed
>>>>> the number of drives in the box or the kernel's disk detection order
>>>>> changed, breaking the OSD.
>>>>>
>>>>> --
>>>>> May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]