To close this thread, and to give people a reference point, this is what I did
(and it worked).
I wound up not using a mirror; I just used vgextend/pvmove/vgreduce to move the
data.
1. Verify the devices in question:
# lsblk
NAME                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda                                     8:0    0   3.6T  0 disk
└─ceph--128bd83e--4bb5--466d--8f1c--1f00d93838f2-osd--block--fe7602d3--8e32--46e3--8dd6--4261e1b2feb6  253:2  0  3.6T  0 lvm
sdb                                     8:16   0   3.6T  0 disk
└─ceph--fef757df--55fc--41cb--aaa8--6f7675871c1f-osd--block--358ec8f2--e51b--419c--88bd--f088823c26a9  253:5  0  3.6T  0 lvm
sdc                                     8:32   0   3.6T  0 disk
└─ceph--92735252--d160--446b--b757--a2b2bf8e7ba7-osd--block--35c13cd7--bdd9--4c3b--a46d--6caaf7fc400a  253:3  0  3.6T  0 lvm
sdd                                     8:48   0 465.8G  0 disk
├─sdd1                                  8:49   0   600M  0 part /boot/efi
├─sdd2                                  8:50   0     1G  0 part /boot
└─sdd3                                  8:51   0 464.2G  0 part
  ├─rhel-root                         253:0    0 288.4G  0 lvm  /var/lib/containers/storage/overlay, /
  ├─rhel-swap                         253:1    0  11.8G  0 lvm
  └─rhel-home                         253:6    0   350G  0 lvm  /home
sde                                     8:64   0   3.6T  0 disk
└─ceph--fd2da222--2dc7--4be0--89a6--27e485ab4700-osd--block--278cfa46--6c69--4d87--b0c3--27e0b97ce379  253:4  0  3.6T  0 lvm
sdf                                     8:80   0   1.8T  0 disk
├─sdf1                                  8:81   0   512G  0 part
├─sdf2                                  8:82   0   512G  0 part
├─sdf3                                  8:83   0   512G  0 part
└─sdf4                                  8:84   0   327G  0 part
nvme0n1                               259:0    0 931.5G  0 disk
└─nvme0n1p1                           259:1    0 931.5G  0 part
  └─rhel-root                         253:0    0 288.4G  0 lvm  /var/lib/containers/storage/overlay, /
sdc was the failing drive, so I knew which LVM volume group to look at.
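If you're not sure which VG is sitting on the failing device, the standard LVM
reporting commands will show the mapping (adjust the device name for your
system):
# pvs -o pv_name,vg_name /dev/sdc
# lvs -o lv_name,vg_name,devices | grep sdc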
2. Create a PV on the new device:
# pvcreate /dev/sdg
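If pvcreate refuses the disk because of an old filesystem or RAID signature,
wipefs (from util-linux) can clear it first, and pvs is a quick sanity check
that the PV exists - this assumes the new disk really did come up as /dev/sdg:
# wipefs -a /dev/sdg   ## destructive - only if pvcreate complained about existing signatures
# pvs /dev/sdg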
3. Extend the volume group onto the new drive
# vgextend ceph-fd2da222-2dc7-4be0-89a6-27e485ab4700 /dev/sdg
  Volume group "ceph-fd2da222-2dc7-4be0-89a6-27e485ab4700" successfully extended
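At this point the VG should show two PVs - a quick sanity check (generic LVM,
your VG name will differ):
# vgs -o vg_name,pv_count,vg_size,vg_free ceph-fd2da222-2dc7-4be0-89a6-27e485ab4700
# pvs -o pv_name,vg_name,pv_size,pv_free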
4. Move the data  ## this will take some time - go have a nice dinner, get some
sleep, binge-watch a TV show, ...
# pvmove /dev/sdc
/dev/sdc: Moved: 0.01%
/dev/sdc: Moved: 0.06%
.......
/dev/sdc: Moved: 99.98%
/dev/sdc: Moved: 100.00%
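A couple of pvmove conveniences worth knowing (standard LVM behavior, nothing
ceph-specific):
# pvmove -b /dev/sdc   ## run in the background instead of tying up the terminal
# pvmove               ## with no arguments, resumes a move that was interrupted
# pvmove --abort       ## backs the move out if you change your mind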
5. Remove the dying drive from the volume group
# vgreduce ceph-fd2da222-2dc7-4be0-89a6-27e485ab4700 /dev/sdc
Removed "/dev/sdc" from volume group
"ceph-fd2da222-2dc7-4be0-89a6-27e485ab4700"
6. (Very important) Verify the serial number of the drive to remove:
# udevadm info --query=all --name=/dev/sdc | grep ID_SERIAL
E: ID_SERIAL=ST4000DX001-1CE168_Z303TB0R
E: ID_SERIAL_SHORT=Z303TB0R
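lsblk and smartctl can also report the serial as a cross-check before pulling
hardware (smartctl comes from smartmontools):
# lsblk -o NAME,SERIAL /dev/sdc
# smartctl -i /dev/sdc | grep -i serial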
7. Put the host into maintenance, then halt the system and swap the physical
drives.
8. Start the host, take it out of maintenance, and let any sync happen -
watch ceph status until it is completely recovered - a few minutes.
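For reference, with cephadm the maintenance dance looks roughly like this -
adjust the hostname; these are the standard ceph orch maintenance commands:
# ceph orch host maintenance enter <hostname>
## halt, swap the physical drive, power the host back on
# ceph orch host maintenance exit <hostname>
# watch ceph -s   ## wait until everything is back to active+clean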
Doing it this way, I did not notice any issues with Ceph during the move - I
had things interacting with CephFS and RBD the whole time without a glitch.
There are a few PGs marked as backfilling - I may have missed a delay with the
host maintenance - but it's only about 20, which is far fewer than if I had
taken the entire OSD out and rebuilt it.
Thanks,
Rob
-----Original Message-----
From: Eugen Block <[email protected]>
Sent: Wednesday, January 3, 2024 2:37 PM
To: [email protected]
Subject: [ceph-users] Re: Best way to replace Data drive of OSD
Hi,
in such a setup I also prefer option 2; we've done this since LVM came into
play with OSDs, just not with cephadm yet. But we have a similar configuration,
and one OSD is starting to fail as well. I'm just waiting for the replacement
drive to arrive. ;-)
Regards,
Eugen
Quoting "Robert W. Eckert" <[email protected]>:
> Hi - I have a drive that is starting to show errors, and was wondering
> what the best way to replace it is.
>
> I am on Ceph 18.2.1 using cephadm/containers. I have 3 hosts, and
> each host has 4 4TB drives with a 2TB NVMe device split amongst them
> for WAL/DB, and 10 Gb networking.
>
>
> Option 1: Stop the OSD, use dd to copy from the old drive to the new,
> remove the old drive, and reboot so LVM recognizes the new drive as the
> volume the old one was.
> Option 2: Use LVM to mirror the old drive to the new one, then remove
> the old drive once the mirroring is complete. This way I don't have to
> remove and reprovision the OSD, and the OSD doesn't need to be down
> during any of it.
> Option 3: Remove the OSD, let everything settle down, swap the drive,
> fight the orchestrator to get the OSD provisioned with the OSD and DB
> partition on the proper partition of the NVMe, then let everything
> sync up again.
>
> I am leaning towards Option 2, because it should have the least
> impact/overhead on the rest of the drives, but am open to the other
> options as well.
>
> Thanks,
> Rob
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]