On 06/13/2012 06:40 AM, Bernd Schubert wrote:
> What about an easy to setup cluster file system such as FhGFS?
Great suggestion. I'm all for a generally useful parallel file system
instead of a torrent solution with a very narrow use case.

> As one of its developers I'm a bit biased of course, but then I'm also
> familiar

I think this list is exactly the place where a developer should jump in
and suggest/explain their solutions as they relate to use in HPC clusters.

> with Lustre, and I think FhGFS is far easier to set up. We also do not
> have the problem with running clients and servers on the same node, and
> some of our customers make heavy use of that and use their compute nodes
> as storage servers. That should provide the same or better throughput as
> your torrent system.

I found the wiki, the "view flyer", FAQ, and related. I had a few
questions. I found this link
http://www.fhgfs.com/wiki/wikka.php?wakka=FAQ#ha_support
but was not sure of the details.

What happens when a metadata server dies? What happens when a storage
server dies? If either of the above means data loss/failure/unreadable
files, is there a description of how to protect against this with
drbd+heartbeat or equivalent? (A rough sketch of the kind of setup I have
in mind is at the end of this mail.)

Sounds like source is not available, and only binaries for CentOS? Looks
like it does need a kernel module; does that mean only old 2.6.x CentOS
kernels are supported? Does it work with mainline OFED on QLogic and
Mellanox hardware?

From a sysadmin point of view I'm also interested in:

* Do blocks auto-balance across storage nodes?
* Is managing disk space, inodes (or equivalent), and related capacity
  planning complex? Or does df report useful/obvious numbers?
* Can storage nodes be added/removed easily by migrating data on/off of
  hardware?
* Does FhGFS handle 100% of the distributed file system responsibilities,
  or does it layer on top of xfs/ext4 or related (like Ceph)?
* With large files, does performance scale reasonably with storage servers?
* With small files, does performance scale reasonably with metadata
  servers?

BTW, if anyone is current on any other parallel file system, I (and I
suspect others on the list) would find a similar rundown very valuable.
I run a Hadoop cluster, but I suspect there are others on the list who
could provide better answers than I can. My Lustre knowledge is
second-hand and dated.
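For concreteness, here is roughly what I mean by "drbd+heartbeat or
equivalent". This is just a generic drbd 8.x resource definition, nothing
FhGFS-specific; the hostnames, devices, and addresses below are invented
for illustration. The idea would be to mirror the metadata server's
underlying partition to a standby node:

  # /etc/drbd.d/fhgfs_meta.res -- hypothetical example, names/IPs invented
  resource fhgfs_meta {
    protocol C;                  # synchronous replication to the peer
    device    /dev/drbd0;
    disk      /dev/sdb1;         # partition holding the metadata store
    meta-disk internal;
    on meta1 {
      address 10.0.0.1:7789;
    }
    on meta2 {
      address 10.0.0.2:7789;
    }
  }

Heartbeat (or pacemaker) would then handle promoting drbd, mounting the
filesystem, bringing up a floating IP, and restarting the metadata daemon
on the surviving node. Whether FhGFS behaves well under that kind of
failover is exactly what I am asking about.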