So this is how we are doing it, and it is working very well:
- Put the index and metadata pools on fast NVMe drives. Our flash disks are at 1% disk-space usage.
- Only have the data pool on spinners.
- For every 5 HDDs we have 1 NVMe serving the OSD metadata.
- After the Reef upgrade we enabled autosharding, because we run a multisite setup.
- Having a bucket with 503 (a prime number) shards is not a problem, and we scaled one bucket from 11 shards to 401 without issue.
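The shard counts above (11, 401, 503) follow the common practice of pre-sharding to a prime. A small sketch of how one might pick such a count, assuming the RGW default of roughly 100k objects per shard (`rgw_max_objs_per_shard`); the helper names are illustrative, not Ceph APIs:

```python
import math

OBJS_PER_SHARD = 100_000  # RGW default rgw_max_objs_per_shard


def is_prime(n: int) -> bool:
    """Trial division is plenty for shard-count-sized numbers."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def shard_count(expected_objects: int) -> int:
    """Smallest prime >= ceil(expected_objects / OBJS_PER_SHARD)."""
    n = max(1, math.ceil(expected_objects / OBJS_PER_SHARD))
    while not is_prime(n):
        n += 1
    return n
```

For example, 50 million expected objects needs at least 500 shards, and the next prime is 503 (the number used above).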
On Wed, 4 June 2025 at 12:43, <[email protected]> wrote:
> Hi Michael,
>
> The challenges with Veeam backups are more around metadata.
>
> I do hope that when you said enterprise SSDs, you meant NVMe drives?
>
> By default, Veeam will write everything into one bucket. Depending on the
> size of your environment, the problem with this is that you will run into
> bucket sharding issues. You can also get issues with large OMAP objects.
>
> What happens is that a bucket gets resharded when it reaches a certain
> number of objects. If you know how many objects you are likely to be
> storing, you can pre-shard the bucket for this larger number of objects
> to help prevent this issue.
>
> Newer versions of Veeam do allow using multiple buckets at the backend,
> which I would strongly suggest you use.
>
> The other large hit you get from Veeam is that if you are using Object
> Lock, you get lots of put-object-retention metadata changes, which can
> generate a significant amount of metadata traffic.
>
> I would strongly suggest you put in some additional NVMe drives so you
> can move the S3 metadata onto these devices for better performance.
>
> Darren
>
> > On 4 Jun 2025, at 08:28, Ml Ml <[email protected]> wrote:
> >
> > Hi everyone,
> >
> > I'm currently planning a Ceph-based S3 storage backend, with a strong
> > focus on balancing performance and cost (EUR per TB).
> > Fully SSD-based setups seem too expensive for my use case, while
> > HDD-only configurations might be too slow.
> >
> > My current idea is to use a hybrid approach:
> > - HDDs for bluestore_block (bulk data)
> > - Enterprise SSDs for bluestore_block.db and bluestore_block.wal
> >   (metadata and journal)
> >
> > The main client of this S3 backend will be Veeam, which, as far as I
> > can tell, uses larger block sizes.
> > Since this results in fewer random writes and mostly sequential I/O, I
> > assume this kind of workload can work well with HDDs, as long as
> > metadata is fast (hence SSDs for DB/WAL).
> >
> > Am I on the right track with this assumption?
> >
> > - Do you have any benchmark results (real-world or synthetic)
> >   comparing HDD-only vs. hybrid setups for object workloads like
> >   Veeam? That would help a lot to quantify the performance difference.
> > - What SSD-to-HDD ratio would you recommend? (I've seen 1 SSD per
> >   3-5 HDDs.)
> >
> > Any advice or confirmation from those running similar setups would be
> > much appreciated!
> >
> > So far I found:
> > https://www.hyperscalers.com/image/catalog/00-Products/Storage%20Servers/White%20Paper%20-%20Performance%20Testing%20of%20Ceph%20with%20SD1Q-1ULH.pdf
> > => which mentions Ceph version 0.94.5, and the performance boost does
> > not seem too big there
> >
> > And here:
> > https://www.ambedded.com.tw/en/use-case/use-case-08.html
> > they don't mention such a hybrid setup
> >
> > Thanks a lot in advance!
> >
> > Best regards,
> > Michael
> > _______________________________________________
> > ceph-users mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]

--
The "UTF-8 problems" self-help group will, as an exception, meet this time in the large hall.
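For the hybrid DB/WAL layout discussed in this thread, a rough sizing sketch; the 4% `block.db` figure follows the commonly cited Ceph BlueStore rule of thumb for RGW-heavy workloads and is an assumption here, not a requirement, and the helper names are illustrative:

```python
# Rough sizing for one NVMe shared as DB/WAL by several HDD OSDs,
# matching the 1-NVMe-per-5-HDDs ratio mentioned in the thread.
HDDS_PER_NVME = 5
DB_FRACTION = 0.04  # rule of thumb: block.db ~4% of the data device


def db_partition_gib(hdd_tib: float) -> float:
    """DB/WAL partition size (GiB) to carve out per HDD OSD."""
    return hdd_tib * 1024 * DB_FRACTION


def nvme_needed_gib(hdd_tib: float) -> float:
    """Total NVMe capacity needed for one group of HDDs."""
    return HDDS_PER_NVME * db_partition_gib(hdd_tib)
```

For example, five 18 TiB HDDs would each want a partition of about 737 GiB, i.e. roughly 3.7 TiB of NVMe per group under this rule of thumb.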
