On Thu, Jun 04, 2009 at 05:33:32AM +0200, Bogdan Costescu wrote:
> On Wed, 3 Jun 2009, John Hearns wrote:
>
>> Quite often the architecture of storage is a secondary consideration,
>> in the rush to get a Shiny New Fast machine on site and working.
>
> Well, I've seen it ignored even outside of that rush - in the design
> phase. And I confess of being guilty of doing this as well, but I learn
> from mistakes :-)
> .....
>
> I see duplication of data in almost all cases as a human behaviour
> problem, not a technical one, which needs human behaviour solutions and
> not technical ones, so policies are a good solution.

Take this list for example. We each get our own copy of every message,
and at times multiple copies as a side effect of replies. One key here
is the lack of 'caching' tools, both for mail and for HPC in the I/O
filesystem space. Several issues make this hard: some technical, some
social, some habitual.

On the habitual side, I was recently looking at a CS homework assignment
and noticed that the primary instruction began with "copy" both the code
and the "data", and ended with "copy code and data" to submit the
result.

The low-budget answer today is a human behaviour solution. Longer-term
solutions will need a much better understanding of the "data flow" and
"data state" of the many things that get replicated (mail and
attachments, for example), including the "off line" state, multiple
keyboards (home/work), and the state and quality of connectivity.

It is possible that HPC tools and mail could evolve toward a
Mercurial-style (revision control) view of data. This in turn implies a
longer reach for access control and access policy tools.

-- 
T o m  M i t c h e l l
Found me a new hat, now what?