Hi,

I'm just posting this here as a warning to other users.
The Kioxia KCD61LUL7T68 NVMe devices have terrible discard performance.
(All of my devices report firmware 8001.)

I have several of these Kioxia SSDs in a Ceph cluster. I normally set bdev_async_discard_threads and bdev_enable_discard on all SSDs.
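For reference, this is roughly how I enable it (these are the real BlueStore option names, but the values and the restart step are just what I typically do on a package-based deployment, not a recommendation):

    # enable discard on the BlueStore block devices
    ceph config set osd bdev_enable_discard true
    # issue discards asynchronously from dedicated threads
    ceph config set osd bdev_async_discard_threads 1
    # OSDs may need a restart to pick up the new settings
    systemctl restart ceph-osd.target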

I have a script which creates new RBD snapshots and removes old ones. It has plenty of sleep commands to make the whole process easier on the cluster, but these SSDs still cannot handle it.
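The script is roughly along these lines (pool, image and snapshot names here are made up, and the real script iterates over a list of images, but this is the general shape):

    #!/bin/bash
    # create a fresh snapshot, then trim old ones, pausing between steps
    POOL=rbd
    IMG=vm-disk-01
    NEW_SNAP="backup-$(date +%Y%m%d)"

    rbd snap create ${POOL}/${IMG}@${NEW_SNAP}
    sleep 60

    # remove everything except the newest 7 snapshots, one at a time
    for OLD in $(rbd snap ls ${POOL}/${IMG} --format json | jq -r '.[].name' | head -n -7); do
        rbd snap rm ${POOL}/${IMG}@${OLD}
        sleep 300   # give the cluster time to absorb the resulting deletes/discards
    done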

Example iostat:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.6%    0.0%   27.0%   10.4%    0.0%   17.0%

     r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz Device
 1482.00     55.9M   208.00  12.3%    0.22    38.6k nvme1n1
   13.00    220.0k     0.00   0.0%    0.15    16.9k nvme2n1

     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz Device
 7333.00     92.6M 15303.00  67.6%    0.15    12.9k nvme1n1
 1470.00     13.2M   294.00  16.7%    0.10     9.2k nvme2n1

     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz Device
  807.00      3.8M     0.00   0.0%    2.27     4.8k nvme1n1
  924.00      7.5M     0.00   0.0%    2.08     8.4k nvme2n1

The NVMes start to choke at around 700-800 discard IOPS.
These discard op sizes are probably terrible for the SSDs, but as far as I know there is no way to fix this other than disabling discard in BlueStore.
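For anyone who wants to turn it off, that would be something along the lines of (again assuming a package-based deployment; the restart step differs under cephadm):

    ceph config set osd bdev_enable_discard false
    # takes effect once the OSDs reopen their block devices
    systemctl restart ceph-osd.target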

In this case the SSDs were about half full.

I've used multiple brands of SSDs with Ceph with discard enabled, and I've never seen results this bad.

Best regards
Adam Prycki

