On Aug 18, 2012, at 1:04 PM, Andrew Holway wrote:

> 2012/8/17 Vincent Diepeveen <d...@xs4all.nl>:
>> The homepage looks very commercial and they have a free trial on it.
>> You refer to the free trial?
>
> http://nexentastor.org/ - Sorry, wrong link. It's a commercially backed
> open source project.
>
>> Means buying a raid controller. That's extra cost. That depends upon
>> what it costs.
>
> You just need to attach the disks to some SATA port. ZFS does all the
> raid stuff internally in software.
>
>> But it does mean that for every node and every disk read and write
>> you do, they all hammer at the same time at that single basket.
>
> ZFS seems to handle this kind of stuff elegantly. As with hardware
> raid, each disk can be accessed individually. NFSoRDMA would ensure
> speedy access times.
>
>> I don't see how you can get out of 1 cheap box good performance like
>> that.
>
> Try it and see. It would certainly be much less of a headache than
> some kind of distributed filesystem, which, in my opinion, is complete
> overkill for a 4-node cluster. All of the admins I know who look after
> these systems have the haunted look of a village elder who must choose
> which of the village's daughters will be fed to the Lustre monster
> every 6 months.
>
> Don't forget to put in as much memory as you can afford and ideally an
> SSD for read cache (assuming that you access the same blocks over and
> over in some fashion).
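For concreteness, I take it the setup you describe would be something
along these lines - one pool over the plain SATA disks, the SSD attached
as read cache, and the whole thing exported over NFS (the disk names are
placeholders and raidz2 is just one possible layout):

  zpool create tank raidz2 sdb sdc sdd sde sdf sdg
  zpool add tank cache sdh      # the SSD becomes the L2ARC read cache
  zfs set sharenfs=on tank      # export the pool over NFS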
I designed something myself with a data structure that's close to ZFS,
according to someone who was working for Sun in Bangalore at the time;
this was before ZFS was popular, or even introduced (I'm not sure - it
was 2001-2002 or so), but I'm not aware of how the filesystem has been
extended since then to satisfy professional needs :) Note I wasn't aware
that it now works on Linux as open source. Does it?

My workload streams a dataset of around 1.3 TB over and over again, and
each pass something in the dataset gets modified. So the output is a
bitstream that you store, and that bitstream, all cores together, comes
to 1.3 TB or so.

Note that when I write TB I mean terabytes; all those RAID cards quote
Gb = gigaBITS.

1.3 TB of SSD per node would speed it up considerably, but that's too
expensive.

I do agree about maintenance, but my cluster here is no larger than 8
nodes, and I do want that performance of 0.5 GB/s per node, so with 8
nodes that means 4 GB/s of aggregate bandwidth to the I/O, and not the,
say, nearly 800 MB/s that most RAID cards that are cheap on eBay seem to
deliver.

So some sort of distributed filesystem seems the best option, and a lot
cheaper and a lot faster than a dedicated fileserver that will not be
able to keep up.
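To spell out the arithmetic behind those numbers (rough figures, and
mind the factor of 8 between bits and bytes):

  8 nodes * 0.5 GB/s  =  4 GB/s aggregate
  1300 GB / 4 GB/s    =  ~325 s, about 5.5 minutes per full sweep
  1300 GB / 0.8 GB/s  =  ~1625 s, about 27 minutes through one 800 MB/s card
  (8 Gb = 1 GB, so a card advertising "6 Gb/s" moves at most 750 MB/s)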