So this is how we are doing it, and it is working very well:
- Put the index and metadata pools on fast NVMe drives. Our flash disks are at 1% disk-space usage.
- Only have the data pool on spinners.
- For every 5 HDDs we have 1 NVMe serving the OSD metadata.
- After the Reef upgrade we enabled autosharding, because we run a multisite setup.
- Having a bucket with 503 (a prime number) shards is not a problem, and we scaled one bucket from 11 shards to 401 without issue.
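The shard counts above (11, 401, 503) follow the common practice of pre-sharding to a prime. A small sketch of how one might pick such a count, assuming the RGW default of roughly 100k objects per shard (`rgw_max_objs_per_shard`); the helper names are illustrative, not Ceph APIs:

```python
import math

OBJS_PER_SHARD = 100_000  # RGW default rgw_max_objs_per_shard


def is_prime(n: int) -> bool:
    """Trial division is plenty for shard-count-sized numbers."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def shard_count(expected_objects: int) -> int:
    """Smallest prime >= ceil(expected_objects / OBJS_PER_SHARD)."""
    n = max(1, math.ceil(expected_objects / OBJS_PER_SHARD))
    while not is_prime(n):
        n += 1
    return n
```

For example, 50 million expected objects needs at least 500 shards, and the next prime is 503 (the number used above).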
On Wed, 4 June 2025 at 12:43, <[email protected]> wrote:
> Hi Michael,
>
> The challenges with Veeam backups are more around metadata.
>
> I do hope that when you said enterprise SSDs, you meant NVMe drives?
>
> By default, Veeam will write everything into one bucket. Depending on the
> size of your environment, the problem with this is that you will run into
> bucket sharding issues. You can also get issues with large OMAP objects.
>
> What happens is that a bucket gets resharded when it reaches a certain
> number of objects. If you know how many objects you are likely to be
> storing, you can pre-shard the bucket for this larger number of objects
> to help prevent this issue.
>
> Newer versions of Veeam do allow using multiple buckets at the backend,
> which I would strongly suggest you use.
>
> The other large hit you get from Veeam is that if you are using Object
> Lock, you get lots of put-object-retention metadata changes, which can
> generate a significant amount of metadata traffic.
>
> I would strongly suggest you put in some additional NVMe drives so you
> can move the S3 metadata onto these devices for better performance.
>
> Darren
>
> > On 4 Jun 2025, at 08:28, Ml Ml <[email protected]> wrote:
> >
> > Hi everyone,
> >
> > I'm currently planning a Ceph-based S3 storage backend, with a strong
> > focus on balancing performance and cost (EUR per TB).
> > Fully SSD-based setups seem too expensive for my use case, while
> > HDD-only configurations might be too slow.
> >
> > My current idea is to use a hybrid approach:
> > - HDDs for bluestore_block (bulk data)
> > - Enterprise SSDs for bluestore_block.db and bluestore_block.wal
> >   (metadata and journal)
> >
> > The main client of this S3 backend will be Veeam, which, as far as I
> > can tell, uses larger block sizes.
> > Since this results in fewer random writes and mostly sequential I/O, I
> > assume this kind of workload can work well with HDDs, as long as
> > metadata is fast (hence SSDs for DB/WAL).
> >
> > Am I on the right track with this assumption?
> >
> > - Do you have any benchmark results (real-world or synthetic)
> >   comparing HDD-only vs. hybrid setups for object workloads like
> >   Veeam? That would help a lot to quantify the performance difference.
> > - What SSD-to-HDD ratio would you recommend? (I've seen 1 SSD per
> >   3-5 HDDs.)
> >
> > Any advice or confirmation from those running similar setups would be
> > much appreciated!
> >
> > So far I found:
> > https://www.hyperscalers.com/image/catalog/00-Products/Storage%20Servers/White%20Paper%20-%20Performance%20Testing%20of%20Ceph%20with%20SD1Q-1ULH.pdf
> > => which mentions Ceph version 0.94.5, and the performance boost does
> > not seem too big there
> >
> > And here:
> > https://www.ambedded.com.tw/en/use-case/use-case-08.html
> > they don't mention such a hybrid setup
> >
> > Thanks a lot in advance!
> >
> > Best regards,
> > Michael
> > _______________________________________________
> > ceph-users mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]

--
The "UTF-8 problems" self-help group will, as an exception, meet this time in the large hall.
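For the hybrid DB/WAL layout discussed in this thread, a rough sizing sketch; the 4% `block.db` figure follows the commonly cited Ceph BlueStore rule of thumb for RGW-heavy workloads and is an assumption here, not a requirement, and the helper names are illustrative:

```python
# Rough sizing for one NVMe shared as DB/WAL by several HDD OSDs,
# matching the 1-NVMe-per-5-HDDs ratio mentioned in the thread.
HDDS_PER_NVME = 5
DB_FRACTION = 0.04  # rule of thumb: block.db ~4% of the data device


def db_partition_gib(hdd_tib: float) -> float:
    """DB/WAL partition size (GiB) to carve out per HDD OSD."""
    return hdd_tib * 1024 * DB_FRACTION


def nvme_needed_gib(hdd_tib: float) -> float:
    """Total NVMe capacity needed for one group of HDDs."""
    return HDDS_PER_NVME * db_partition_gib(hdd_tib)
```

For example, five 18 TiB HDDs would each want a partition of about 737 GiB, i.e. roughly 3.7 TiB of NVMe per group under this rule of thumb.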
