On 11/13/12 19:00, Bill Broadley wrote:
> If you need an object store and not a file system I'd consider hadoop.
Eeek -- 0.5MB to 10MB files are anathema for Hadoop. As much as I love Hadoop, there's a tool for every job, and I'm not sure this one quite fits for those file sizes. If you had a decent chunk of larger files (i.e. >64MB at the very least, ideally around 1GB on average), Hadoop might work.

The specific use of the file system seems particularly relevant to this discussion, so if you can pin down some more hard-and-fast ideas about how your storage will actually be used, we'll probably have a better idea of what to suggest.

IMHO, it's not storing that much data annually that makes this a hard problem -- it's what you want to do with it (and how fast). If you never want to look at it again, and you're receiving that 1PB steadily over the course of the year, you'll note that this works out to around 34MB/s. That's pretty easy for any parallel file system (or really, even a single slow HDD, provided you just continue onto the next one once the current one fills up). It becomes interesting if you need to handle big bursts of writes, big bursts of reads, reads of the whole (or large portions of the) data set, etc. Again, knowing what you need will help us a lot here.

Best,
ellis

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
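P.S. For anyone who wants to check the ~34MB/s figure above, here's the back-of-the-envelope arithmetic as a short script (a sketch assuming 1 PiB, i.e. 2**50 bytes, arriving evenly over a 365-day year):

```python
# Sustained ingest rate needed to absorb 1 PiB over one year.
# Assumption: data arrives evenly; no bursts, no replication overhead.
PIB = 2**50                         # bytes in a pebibyte
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds

rate_bytes_per_s = PIB / SECONDS_PER_YEAR
rate_mib_per_s = rate_bytes_per_s / 2**20  # convert to MiB/s

print(f"sustained ingest: {rate_mib_per_s:.1f} MiB/s")  # ~34.1 MiB/s
```

Well within what a single modern HDD can sustain sequentially, which is the point: the steady-state write rate is not the hard part.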