Mark, I run experiments on my toy cluster of PS3s and I'm interested in your ideas. The PS3 has two network interfaces: the GLAN (gigabit Ethernet) NIC and Wi-Fi. The distros that run on my firmware 2.70 are Yellow Dog Linux 6.1 NEW, PSUBUNTU (Ubuntu 9.04), and Fedora 11. For networking there are an Allied Telesyn AT-GS900/8E switch and a D-Link DIR-320 Wi-Fi router with USB. Besides these systems there are a Core2Duo E8400 / 8 GB RAM / 1.5 TB HDD and a Celeron 1.8 / 1 GB RAM / 80 GB HDD. While I'm waiting for my partner-programmer to implement the ILP64 scheme in his CFD package, the PS3s sit without any work and are ready for any experiments with their 80 GB HDDs. I would prefer not to load their GLAN NICs with anything except MPI traffic, but maybe it's possible to use the Wi-Fi for storage I/O? There are only two PS3s, but that's enough for an experiment.
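Just to make the question concrete, a minimal sketch of what I have in mind (assuming Open MPI, that the GLAN NIC shows up as eth0 and the Wi-Fi as wlan0 on a 192.168.2.0/24 subnet, and that the Core2Duo box exports an NFS share /scratch over the Wi-Fi side; all of these names and addresses are just placeholders):

  # keep MPI point-to-point traffic on the gigabit NIC only
  mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 \
         -np 2 -hostfile ps3hosts ./solver

  # mount scratch from the front-end machine over the Wi-Fi interface
  mount -t nfs 192.168.2.1:/scratch /mnt/scratch

That way the GLAN NICs carry nothing but MPI, and everything else rides on wlan0.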
Dmitry Zaletnev

> > users to cache data-in-progress to scratch space on the nodes. But there's a
> > definite draw to a single global scratch space that scales automatically with
> > the cluster itself.
>
> using node-local storage is fine, but really an orthogonal issue.
> if people are willing to do it, it's great and scales nicely.
> it doesn't really address the question of how to make use of
> 3-8 TB per node. we suggest that people use node-local /tmp,
> and like that name because it emphasizes the nature of the space.
> currently we don't sweat the cleanup of /tmp (in fact we merely
> have the distro-default 10-day tmpwatch).
>
> > > - obviously want to minimize the interference of remote IO to a node's jobs.
> > > for serial jobs, this is almost moot. for loosely-coupled parallel jobs
> > > (whether threaded or cross-node), this is probably non-critical. even for
> > > tight-coupled jobs, perhaps it would be enough to reserve a core for
> > > admin/filesystem overhead.
> >
> > I'd also strongly consider a separate network for filesystem I/O.
>
> why? I'd like to see some solid numbers on how often jobs are really
> bottlenecked on the interconnect (assuming something reasonable like DDR IB).
> I can certainly imagine it could be so, but how often does it happen?
> is it only for specific kinds of designs (all-to-all users?)
>
> > > - distributed filesystem (ceph? gluster? please post any experience!) I
> > > know it's possible to run oss+ost services on a lustre client, but not
> > > recommended because of the deadlock issue.
> >
> > I played with PVFS1 a bit back in the day. My impression at the time was
>
> yeah, I played with it too, but forgot to mention it because it is afaik
> still dependent on all nodes being up. admittedly, most of the alternatives
> also assume all servers are up...
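On the "gluster? please post any experience!" point, a minimal sketch of pooling node-local disks into a single global scratch volume with GlusterFS might look roughly like this (node01/node02, /dev/sdb1 and /bricks/scratch are placeholder names, and it assumes a GlusterFS release that ships the gluster CLI):

  # on each node: turn the local scratch disk into a brick
  mkfs.xfs /dev/sdb1
  mkdir -p /bricks/scratch
  mount /dev/sdb1 /bricks/scratch

  # from one node: join the peers and pool the bricks into a distributed volume
  gluster peer probe node02
  gluster volume create scratch transport tcp \
          node01:/bricks/scratch node02:/bricks/scratch
  gluster volume start scratch

  # on every client: mount the pooled volume as the global scratch space
  mount -t glusterfs node01:/scratch /scratch

A plain distributed volume only concatenates the bricks, so when a node goes down the files on its brick vanish until it comes back; adding "replica 2" halves the usable capacity but avoids exactly the "dependent on all nodes being up" problem mentioned above.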