Sent prematurely. I meant to add that after ~3 years of service, the 1 DWPD drives in the clusters I mentioned mostly reported <10% of endurance burned.
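
As a very rough sketch of that arithmetic, and of the quoted warning further
down about write amplification and 3-fold replication, the back-of-the-envelope
math looks something like this (Python; every number here is an illustrative
assumption, not a measurement from these clusters or from the OP's workload):

  drives      = 40     # the OP's 10 nodes x 4 OSDs
  capacity_tb = 1.8    # per-drive capacity, TB
  dwpd        = 1.0    # rated drive writes per day over the warranty period
  warranty_y  = 5      # years over which the DWPD rating applies
  replication = 3      # Ceph pool size
  write_amp   = 5.0    # assumed WAL/DB/compaction amplification on top of
                       # replication; highly workload-dependent

  # Client writes per day the cluster can absorb while staying within 1 DWPD.
  device_budget_per_day = drives * capacity_tb * dwpd   # ~72 TB/day at the media
  client_budget_per_day = device_budget_per_day / (replication * write_amp)
  print(f"sustainable client writes: ~{client_budget_per_day:.1f} TB/day")

  # Fraction of the rated endurance burned after some years at a given rate.
  client_tb_per_day = 2.0                                # assumed workload
  years_in_service  = 3
  media_writes = client_tb_per_day * replication * write_amp * 365 * years_in_service
  rated_tbw    = drives * capacity_tb * dwpd * 365 * warranty_y
  print(f"endurance burned after {years_in_service} years: ~{media_writes / rated_tbw:.0%}")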

Required endurance is in part a function of how long you expect the drives to
last.

>> Having said that, for a storage cluster where write performance is expected
>> to be the main bottleneck, I would be hesitant to use drives that only have
>> 1 DWPD endurance, since Ceph has fairly high write amplification factors. If
>> you use 3-fold replication, this cluster might only be able to handle a few
>> TB of writes per day without wearing out the drives prematurely.

>>> Hi Experts,
>>>
>>> I am trying to find out whether there are significant write performance
>>> improvements to be had by separating the WAL/DB in a Ceph cluster with
>>> all-SSD OSDs. I have a cluster with 40 SSDs (Samsung PM1643 1.8 TB
>>> enterprise SSDs): 10 storage nodes, each with 4 OSDs. Can I get better
>>> write IOPS and throughput if I add one NVMe device per node and put the
>>> WAL/DB on it? Would this separation yield a meaningful performance
>>> improvement or not?
>>>
>>> My Ceph cluster is the block storage back-end of OpenStack Cinder in a
>>> public cloud service.

> My two pfennigs:
>
> * IMHO the performance delta with external WAL+DB is going to be limited.
>   NVMe WAL+DB would deliver lower write latency up to a point, but throughput
>   is still going to be limited by the SAS HBA / bulk OSD drives. You also have
>   the hassle of managing OSDs that span devices: when replacing a failed OSD,
>   properly handling the shared device can be tricky. With your very small
>   number of nodes and drives, the blast radius of one failing would be really
>   large.
>
> * Do you have the libvirt / librbd client-side cache disabled?
>
> * I've run 3R clusters in a similar role, backing libvirt / librbd clients
>   and using SATA SSDs. We were mostly able to sustain an average write latency
>   <= 5 ms, though a couple of times we had to expand a cluster for IOPS before
>   capacity. The crappy HBAs in use were part of the bottleneck. This sort of
>   thing is one of the inputs to the SNIA TCO calculator.
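
To put the first quoted bullet in numbers: a similarly rough sketch of the
write-throughput ceiling that replication plus the SAS back end imposes,
regardless of where the WAL/DB lives (again Python, and again the bandwidth
figures are assumptions for illustration, not measurements from this cluster):

  nodes          = 10
  replication    = 3      # Ceph pool size
  ssds_per_node  = 4
  hba_gbs        = 4.0    # assumed usable SAS HBA bandwidth per node, GB/s
  ssd_write_gbs  = 1.4    # assumed sustained write bandwidth per SAS SSD, GB/s

  # Per node, back-end writes are limited by whichever is smaller: HBA or drives.
  per_node_backend = min(hba_gbs, ssds_per_node * ssd_write_gbs)

  # Every client byte is written to all `replication` replicas, so the
  # client-visible ceiling is the aggregate back-end write bandwidth divided
  # by the replica count.
  client_ceiling = nodes * per_node_backend / replication
  print(f"client write ceiling: ~{client_ceiling:.1f} GB/s cluster-wide")

Moving the WAL/DB to NVMe can shave latency on the metadata path, but the data
itself still lands on the SAS SSDs behind the same HBA, so a ceiling of this
kind does not move.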
