I tested several NFS server configurations with a 19-node cluster. The first piece of advice is to stay away from Red Hat for file servers, since it has some bursty I/O bugs and doesn't support XFS; we use SLES10 for our file servers. That is probably the only real advice you can generalize here, so take the following with a grain of salt:
The following benchmark series was not conducted as marketing material and was not optimized for the fastest possible throughput, but rather to highlight the impact of SLES10 vs. RHEL4, and of a vanilla fibre card plus trunked onboard gige vs. Montillo, for otherwise identical default configurations. Note that we used sync NFS, not async, writing large files (twice as much as the client's RAM) streaming read/write with dd, blocksize 1M (broken down by the OS to a smaller size); a sketch of the commands is appended at the bottom of this message. Each NFS server was dual fibre attached to a Xyratex F5402E with two 6-drive RAID5 volumes (7.2k-rpm SATA drives), using LVM2 to stripe across the two LUNs.

If write speed and scalability to a large number of nodes are important, and your I/O patterns happen to match what I tested with dd (large-file streaming read/write), then the results might tell you that investing in a Montillo RapidFile NFS offloading engine pays off. If read speed is your only concern, you can do better without it.

1. NFS server with SLES10, QLA2462, XFS, gige 4x 1G ports trunked (mode 0).
   - Single client to NFS server: about 40 MB/s write, 80 MB/s read
   - 19 compute nodes in parallel: 25 MB/s aggregate write, 109 MB/s aggregate read
   - Single dd within the NFS server directly to the fibre-attached filesystem: 148 MB/s write, 177 MB/s read

2. NFS server with SLES10 + Montillo RapidFile NFS offloading engine, XFS, 2x 1G ports trunked (mode 0).
   - Single client to NFS server: 85 MB/s write, 95 MB/s read
   - 19 compute nodes in parallel: 43 MB/s aggregate write, 90 MB/s aggregate read
   - Single dd within the NFS server directly to the fibre-attached filesystem: 140 MB/s write, 220 MB/s read

3. NFS server with RHEL4 + Montillo + ext3, otherwise as above.
   - Single client to NFS server: 24 MB/s write, 54 MB/s read
   - 19 compute nodes in parallel: 29 MB/s aggregate write, 69 MB/s aggregate read
   - Single dd within the NFS server directly to the fibre-attached filesystem: 78 MB/s write, 84 MB/s read

It's kind of frustrating that the NFS net bandwidth is so far below what we see locally on the fibre-attached filesystem. It can only partially be explained by the clients' dual onboard NICs having one odd and one even MAC address each, which means that using eth0 on all compute nodes hashes onto the same server-side trunked gige port...

Michael

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Glen Dosey
Sent: Thursday, August 23, 2007 10:12 AM
To: Beowulf
Subject: Re: [Beowulf] Network Filesystems performance

Perhaps I should just ask the simple question. Does anyone on-list obtain greater than 40 MB/s performance from their networked filesystems (when the file is not already cached in the server's memory)? (Yes, it's a loaded question, because if you answer affirmatively, then I know who to interrogate with further questions :)
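
P.S. For reference, each test boiled down to something like the sketch below. The export path, host name, mount options, and sizes are illustrative placeholders rather than the exact scripts; the parts that matter for the comparison are the sync export, dd with bs=1M, and a test file of roughly twice the client's RAM.

    # server side, /etc/exports (sync, not async, is what was benchmarked):
    #   /export/scratch   *(rw,sync,no_subtree_check)

    # client side: mount the export, then stream a file of ~2x client RAM
    mount -t nfs -o tcp fileserver:/export/scratch /mnt/scratch

    # streaming write in 1M blocks (the kernel splits these into smaller
    # wsize-sized requests on the wire); count sized to ~2x client RAM,
    # e.g. 8192 x 1M = 8 GB for a node with 4 GB of memory
    dd if=/dev/zero of=/mnt/scratch/test.$(hostname) bs=1M count=8192

    # streaming read back
    dd if=/mnt/scratch/test.$(hostname) of=/dev/null bs=1M

    # local baseline, run on the NFS server itself against the
    # fibre-attached XFS (or ext3) filesystem, no NFS in the path
    dd if=/dev/zero of=/export/scratch/local-test bs=1M count=8192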