Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Greg Lindahl
On Fri, Jul 22, 2011 at 01:44:56AM -0400, Mark Hahn wrote: > to be honest, I don't understand what applications lead to focus on IOPS > (rationally, not just aesthetic/ideologically). it also seems like > battery-backed ram and logging to disks would deliver the same goods... In HPC, the metadat

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Mark Hahn
>>> Either way, I think if someone were to foolishly just toss together 100TB of data into a box they would have a hell of a time getting >>> anywhere near even 10% of the theoretical max performance-wise. >> >> storage isn't about performance any more. ok, hyperbole, a little. >> but even a

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Greg Lindahl
On Fri, Jul 22, 2011 at 12:33:37AM -0400, Mark Hahn wrote: > storage isn't about performance any more. ok, hyperbole, a little. > but even a cheap disk does >100 MB/s, and in all honesty, there are > not tons of people looking for bandwidth more than a small multiplier > of that. sure, a QDR fi

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Joe Landman
On 07/22/2011 12:33 AM, Mark Hahn wrote: >> Either way, I think if someone were to foolishly just toss together >> 100TB of data into a box they would have a hell of a time getting >> anywhere near even 10% of the theoretical max performance-wise. > > storage isn't about performance any more. ok,

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Mark Hahn
> Either way, I think if someone were to foolishly just toss together > 100TB of data into a box they would have a hell of a time getting > anywhere near even 10% of the theoretical max performance-wise. storage isn't about performance any more. ok, hyperbole, a little. but even a cheap disk doe

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Mark Hahn
> I'm curious, has anyone tried building one of these or know > of anyone who has? a guy here built one, and it seems to behave fine. > Seems like a cheap solution for raw backup. "raw"? I think the backblaze (v1) is used for rsync-based incremental/nightly-snapshots. but yeah, this is a lot

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Greg Lindahl
On Thu, Jul 21, 2011 at 08:03:58PM -0400, Ellis H. Wilson III wrote: > Used in a backup solution, triplication won't get you much more > resilience than RAID6 but will pay a much greater performance penalty to > simply get your backup or checkpoint completed. Hey, if you don't see any benefit fro

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Ellis H. Wilson III
On 07/21/11 18:07, Greg Lindahl wrote: > On Thu, Jul 21, 2011 at 02:55:30PM -0400, Ellis H. Wilson III wrote: >> My personal experience with getting large amounts of data from local >> storage to HDFS has been suboptimal compared to something more raw, > > If you're writing 3 copies of everything

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Greg Lindahl
On Thu, Jul 21, 2011 at 02:55:30PM -0400, Ellis H. Wilson III wrote: > My personal experience with getting large amounts of data from local > storage to HDFS has been suboptimal compared to something more raw, If you're writing 3 copies of everything on 3 different nodes, then sure, it's a lot sl

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Marian Marinov
On Thursday 21 July 2011 21:55:30 Ellis H. Wilson III wrote: > On 07/21/11 14:29, Greg Lindahl wrote: > > On Thu, Jul 21, 2011 at 12:28:00PM -0400, Ellis H. Wilson III wrote: > >> For traditional Beowulfers, spending a year or two developing custom > >> > >> software just to manage big data is li

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Ellis H. Wilson III
On 07/21/11 14:29, Greg Lindahl wrote: > On Thu, Jul 21, 2011 at 12:28:00PM -0400, Ellis H. Wilson III wrote: > >> For traditional Beowulfers, spending a year or two developing custom >> software just to manage big data is likely not worth it. > > There are many open-source packages for big data,

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Greg Lindahl
On Thu, Jul 21, 2011 at 12:28:00PM -0400, Ellis H. Wilson III wrote: > For traditional Beowulfers, spending a year or two developing custom > software just to manage big data is likely not worth it. There are many open-source packages for big data, HDFS being one file-oriented example in the Hado

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Ellis H. Wilson III
On 07/21/11 12:09, Eugen Leitl wrote: > On Thu, Jul 21, 2011 at 11:45:28AM -0400, Douglas Eadline wrote: >> I'm curious, has anyone tried building one of these or know >> of anyone who has? >> >> Seems like a cheap solution for raw backup. I have doubts about the manageability of such large data w

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Eugen Leitl
On Thu, Jul 21, 2011 at 11:45:28AM -0400, Douglas Eadline wrote: > I'm curious, has anyone tried building one of these or know > of anyone who has? > > Seems like a cheap solution for raw backup. We use quite a few of wire-shelved HP N36L with 8 GByte ECC DDR3 RAM and 4x 3 TByte Hitachi drives w

Re: [Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Douglas Eadline
I'm curious, has anyone tried building one of these or know of anyone who has? Seems like a cheap solution for raw backup. -- Doug > > http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/ > > -- > Eugen* Leitl http://leitl.org >

[Beowulf] PetaBytes on a budget, take 2

2011-07-21 Thread Eugen Leitl
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/ -- Eugen* Leitl http://leitl.org __ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA