Evening
Thank you for this snapshot of 'real world' Lustre use. At the risk of hijacking this thread (or borrowing it...) could I ask you a question about Lustre? I've always been interested in Lustre but never used it.
I strongly suggest you grab a few old boxes and play with it; it really is very good. The 1.6 betas are a little unstable, but 1.4 is very, very solid. 1.4 is a tad fiddly to set up, but not really that hard. 1.6 definitely is nice to set up and use.
Like everyone on this mailing list I am interested in a distributed filesystem whose bandwidth and speed are commensurate with the total raw hardware IO performance of the disks and the network speed and intersection bandwidth. But there are two additional features that I also think would be very desirable: (1) RAID-across-nodes. For example, every ten nodes form a redundant RAID set. The disappearance of any one of these nodes causes no data loss, service loss, or corruption at the user level. The total redundant storage available from the ten nodes is 90% of the available raw storage.
No, Lustre does not currently support this. There are lots of ways you could achieve this (as mentioned in other emails), but they will all reduce bandwidth :) It is definitely on the Lustre roadmap to deliver RAID across servers, but it isn't there yet. Having said that, there is nothing stopping you RAIDing the disks within a node. But, as I keep saying, your NFS servers don't do this either ;)
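For concreteness, the 90% figure in the question is what you get from single-parity (RAID-5 style) striping across ten nodes. Below is a minimal sketch of that arithmetic and of rebuilding one lost node's block from the survivors, assuming plain XOR parity. The function names and the tiny 4-byte blocks are made up for illustration; this is not how Lustre or any particular filesystem implements it.

```python
from functools import reduce


def usable_fraction(nodes: int, parity_nodes: int = 1) -> float:
    """Fraction of raw capacity left for data in a parity set of `nodes` nodes."""
    if not 0 <= parity_nodes < nodes:
        raise ValueError("need 0 <= parity_nodes < nodes")
    return (nodes - parity_nodes) / nodes


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Nine hypothetical data nodes plus one parity node, as in the question.
data = [bytes([i] * 4) for i in range(1, 10)]
parity = xor_blocks(data)

# Simulate losing node 3 and rebuilding its block from the survivors plus parity.
lost = 3
survivors = data[:lost] + data[lost + 1:]
rebuilt = xor_blocks(survivors + [parity])

print(usable_fraction(10))    # 0.9
print(rebuilt == data[lost])  # True
```

Note that every rebuild has to read from all nine surviving nodes over the network, which is one reason schemes like this tend to cost bandwidth.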
(2) Symmetry: all nodes have identical behavior and features. There are no specialized IO or metadata nodes, which act as filesystem bottlenecks and which are single points of failure.
No, this is not really what Lustre is trying to achieve. But it does allow you to have failover in the servers and meta-data servers. So, if one crashes, another will take over. Clustered meta-data servers are on the roadmap... but again, it's not there yet. Um... did I mention your NFS servers?
Am I correct that Lustre does not offer either of these features? Do you (or does someone else) know if there is an open-source or commercial distributed (POSIX) filesystem with these features? Cheers, Bruce
I think there are some open-source projects (glusterfs?) that claim to do this, but I suspect their bandwidth is nothing approaching Lustre's... and for all they claim, their meta-data performance probably won't match Lustre's either.

With Lustre 1.6 I was seeing 170MB/s sustained from single clients to the Lustre storage. That's pretty impressive given two NICs in the client... and I didn't even play with the tuning parameters or jumbo frames etc. That was straight out of the box. With 6 OSSs the aggregate bandwidth with 1.6 was ~1GB/s... and it happily ran with 16 instances of Bonnie++ hammering away on it for a week. 1.4 is slightly down on bandwidth... but stable :) I've since been told that with tuning you can get 1.4 to perform as well as 1.6.

I really think people should try Lustre. A lot of people were put off in the early days because there was little documentation and there were few tools to help configure/mount etc. BUT, with 1.4, it is a very nice product. If you can afford Elan, then you will be in for a very nice experience.

Stu.

--
Dr Stuart Midgley
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf