On Fri, 25 Sep 2009 at 6:09pm, Mark Hahn wrote:
> but since 1U nodes are still the most common HPC building block, and most of them support 4 LFF SATA disks with very little added cost (esp using the chipset's integrated controller), is there a way to integrate them into a whole-cluster filesystem?
This is something I've considered/toyed-with/lusted after for a long while. I haven't pursued it as much as I could have because the clusters I've run to this point have generally run embarrassingly parallel jobs, and I train the users to cache data-in-progress to scratch space on the nodes. But there's a definite draw to a single global scratch space that scales automatically with the cluster itself.
> - obviously want to minimize the interference of remote IO with a node's jobs. for serial jobs, this is almost moot. for loosely-coupled parallel jobs (whether threaded or cross-node), this is probably non-critical. even for tightly-coupled jobs, perhaps it would be enough to reserve a core for admin/filesystem overhead.
I'd also strongly consider a separate network for filesystem I/O.
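As for reserving a core for the filesystem daemons, here's a minimal sketch of the sort of wrapper I'm picturing -- it just pins whatever it's told to run onto the highest-numbered core, leaving the rest for jobs. The daemon you'd wrap is up to you, and cpusets/cgroups would be the more robust way to do this:

/* pin_last_core.c -- sketch of a wrapper that confines a command to the
 * highest-numbered core so compute jobs keep the rest.
 * Build: gcc -o pin_last_core pin_last_core.c
 * Use:   ./pin_last_core <some I/O daemon and its args>
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
        return 1;
    }

    long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET((int)(ncpus - 1), &mask);   /* last core only */

    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    execvp(argv[1], &argv[1]);          /* exec'd daemon keeps the affinity mask */
    perror("execvp");
    return 1;
}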
> - distributed filesystem (ceph? gluster? please post any experience!) I know it's possible to run oss+ost services on a lustre client, but not recommended because of the deadlock issue.
I played with PVFS1 a bit back in the day. My impression at the time was that they were focused on MPI-IO, and the POSIX layer was a bit of an afterthought -- access with "regular" tools (tar, cp, etc.) was pretty slow. I don't know what the situation is with PVFS2. Anyone?
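For reference, the access pattern PVFS is really built for looks roughly like the sketch below -- every rank hits one shared file through MPI-IO rather than through the kernel VFS, which is exactly the path that tar and cp can't take. (The file path and sizes here are made up for illustration.)

/* mpiio_write.c -- rough sketch: each MPI rank writes its own
 * non-overlapping block of one shared file via MPI-IO.
 * Build: mpicc -o mpiio_write mpiio_write.c
 * Run:   mpirun -np 8 ./mpiio_write
 */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK (1 << 20)                 /* 1 MiB per rank, arbitrary */

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(BLOCK);
    memset(buf, rank & 0xff, BLOCK);

    /* Collective open of one shared file on the parallel filesystem
       (path is illustrative). */
    MPI_File_open(MPI_COMM_WORLD, "/scratch/testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes its own block at its own offset. */
    MPI_File_write_at(fh, (MPI_Offset)rank * BLOCK, buf, BLOCK,
                      MPI_BYTE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}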
> - this is certainly related to more focused systems like google/mapreduce. but I'm mainly looking for more general-purpose clusters - the space would be used for normal files, and definitely mixed read/write with something close to normal POSIX semantics...
It seems we're after the same thing.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF