As you rightly point out, 110 MB/s sounds very much like the traffic is 
going through the wrong interface or being limited.

So am I correct in reading this as a virtual Ceph environment 
running on Proxmox?

What do you mean by this statement? "All Ceph drives are exposed and an NFS 
mounted NVME drive."

Do I take this to mean that your 4 servers are all mounting the same NVMe 
device over NFS? I'm just a bit confused as to the exact hardware setup here.

What performance can you get from a single Ceph OSD? Just do a simple dd 
read (not write) against an OSD drive.
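Something along these lines would do it (a sketch only; `/dev/nvme0n1` is a placeholder for whichever device backs the OSD, and you'd run it as root on the OSD host):

```shell
# Raw sequential read from one OSD's backing device.
# iflag=direct bypasses the page cache so you measure the disk, not RAM.
# /dev/nvme0n1 is an example device name -- substitute your own.
dev="${CEPH_OSD_DEV:-/dev/nvme0n1}"
if [ -r "$dev" ]; then
    dd if="$dev" of=/dev/null bs=4M count=256 iflag=direct
else
    echo "device $dev not readable; run as root on the OSD host" >&2
fi
```

If each NVMe reads at several hundred MB/s locally but rados bench still tops out around 110 MB/s, that points firmly at the network path rather than the disks.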

Darren
 

> On 2 Aug 2025, at 15:25, Ron Gage <[email protected]> wrote:
> 
> Hello from Detroit MI:
> 
> I have been doing some limited benchmarking of a Squid cluster. The 
> arrangement of the cluster:
> Server        Function
> c01             MGR, MON
> c02             MGR, MON
> o01            OSD
> o02            OSD
> o03            OSD
> o04            OSD
> 
> Each OSD has 2 x NVME disks for Ceph, each at 370 Gig
> 
> The backing network is as follows:
> ens18        Gigabit, mon-ip (192.168.0.0/23) regular MTU (1500)
> ens19        2.5 Gigabit, Cluster Network (10.0.0.0/24) Jumbo MTU (9000)
> 
> Behind all this is a small ProxMox cluster.  All Ceph machines are running on 
> a single node.  All Ceph drives are exposed and an NFS mounted NVME drive.  
> All Ceph OSD drives are mounted with no cache and single controller per 
> drive.  Networking bridges are all set to either MTU 9000 or MTU 1500 as 
> appropriate.
> 
> iPerf3 is showing 2.46 Gbit/sec between servers c01 and o01 on the ens19 
> network.  Firewall is off all the way around.  OS is CentOS 10.  SELinux is 
> disabled.  No network enhancements have been performed (increasing send/rcv 
> buffer size, queue length, etc).
> 
> The concern given all this: rados bench can't exceed 110 MB/s in all tests.  
> In fact if I didn't know better I would swear that the traffic is being 
> either throttled or is somehow routing through a 1Gbit network.  The numbers 
> that are returning from rados bench are acting like saturation at Gigabit and 
> not exhibiting any evidence of being on a 2.5 Gbit network.  Monitoring at 
> both Ceph and ProxMox consoles confirm the same.  Cluster traffic is 
> confirmed to be going out ens19 - tested via tcpdump.
> 
> Typical command line used for rados bench: rados bench -p s3block 20 write
> 
> What the heck am I doing wrong here?
> 
> Ron Gage
> 
> _______________________________________________
> ceph-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]