Re: [Beowulf] (no subject)

2007-02-16 Thread John Bushnell
Hi, I used to do similar kinds of backups on our smallish clusters, but recently decided to do something slightly smarter, and have been using rsnapshot to do backups since. It uses rsync and hard links to make snapshots of /home (or any filesystem you want) without replicating every single by
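
For anyone unfamiliar with the hard-link idea mentioned above, here is a minimal sketch in Python. It is not rsnapshot's code and does not call rsync; the `snapshot` function and its "same size and same mtime means unchanged" rule are assumptions made purely for illustration. Unchanged files are hard-linked to the previous snapshot, so only changed files take up new space.

```python
import os
import shutil


def snapshot(src, prev, dest):
    """Copy the tree at src into dest, hard-linking files unchanged since prev.

    prev is the previous snapshot directory, or None for the first run.
    Returns (number of files linked, number of files copied).
    """
    linked, copied = 0, 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        dest_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in files:
            src_file = os.path.join(root, name)
            dest_file = os.path.join(dest_dir, name)
            unchanged = False
            if prev is not None:
                prev_file = os.path.normpath(os.path.join(prev, rel, name))
                if os.path.isfile(prev_file):
                    a, b = os.stat(src_file), os.stat(prev_file)
                    # Assumed change test: identical size and whole-second mtime.
                    unchanged = (a.st_size == b.st_size
                                 and int(a.st_mtime) == int(b.st_mtime))
            if unchanged:
                # Same inode as the previous snapshot: no extra data stored.
                os.link(prev_file, dest_file)
                linked += 1
            else:
                # copy2 preserves the mtime, so the next run can compare it.
                shutil.copy2(src_file, dest_file)
                copied += 1
    return linked, copied
```

A call such as `snapshot("/home", "/backup/daily.1", "/backup/daily.0")` (hypothetical paths) would then produce a full-looking tree in which only modified files are new copies. Hard links only work when both snapshots live on the same filesystem.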

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread Justin Moore
Despite my Duke e-mail address, I've been at Google since July. While I'm not a co-author, I'm part of the group that did this study and can answer (some) questions people may have about the paper. Dangling meat in front of the bears, eh? Well... I can always hide behind my duck-blind-sla

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread Jim Lux
At 12:50 PM 2/16/2007, David Mathog wrote: Eugen Leitl <[EMAIL PROTECTED]> wrote: > http://labs.google.com/papers/disk_failures.pdf Interesting. However google apparently uses: serial and parallel ATA consumer-grade hard disk drives, ranging in speed from 5400 to 7200 rpm Not quite clear

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread Mark Hahn
Is there any info for failure rates versus type of main bearing in the drive? I thought everyone used something like the "thrust plate" bearing that seagate (maybe?) introduced ~10 years ago. Failure rate vs. drive speed (RPM)? surely "consumer-grade" rules out 10 or 15k rpm disks; their col

Re: [Beowulf] Teraflop chip hints at the future

2007-02-16 Thread Joe Landman
Richard Walsh wrote: >Here you are arguing for an ASIC for each typical HPC kernel ... ala > the GRAPE processor. I will buy that ... but >a commodity multi-core, CPU is not HPC-special-purpose or low power > compared to an FPGA. FPGA power is good, several Watts in most cases. When y

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread Robert G. Brown
On Fri, 16 Feb 2007, David Mathog wrote: Justin Moore wrote: Subject: Re: [Beowulf] failure trends in a large disk drive population To: Eugen Leitl <[EMAIL PROTECTED]> Cc: Beowulf@beowulf.org Message-ID: <[EMAIL PROTECTED]> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed http://lab

Re: [Beowulf] failure trends in a large disk drive population

2007-02-16 Thread Robert G. Brown
On Fri, 16 Feb 2007, Mark Hahn wrote: http://labs.google.com/papers/disk_failures.pdf this is awesome! my new new-years resolution is to be more google-like, especially in gathering potentially large amounts of data for this kind of retrospective analysis. thanks for posting the ref. Yea

Re: [Beowulf] (no subject)

2007-02-16 Thread Mark Hahn
not buy a tape drive for backups. Instead, I've got a jury-rigged backup tapes suck. I acknowledge that this is partly a matter of taste, experience and history, but they really do have some undesirable properties. scheme. The node that serves the home directories via NFS runs a nightly tar

Re: [Beowulf] Teraflop chip hints at the future

2007-02-16 Thread Richard Walsh
Jim Lux wrote: At 07:03 AM 2/13/2007, Richard Walsh wrote: Yes, but how much does it really abandon von Neumann. It is just a lot of little von Neumann machines unless the mesh is fully programmable and the DRAM stacks can source data for any operation on any cpu as the application's data flows

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread David Mathog
Justin Moore wrote: > Subject: Re: [Beowulf] failure trends in a large disk drive population > To: Eugen Leitl <[EMAIL PROTECTED]> > Cc: Beowulf@beowulf.org > Message-ID: <[EMAIL PROTECTED]> > Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed > > > > http://labs.google.com/papers/disk_fai

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread Joe Landman
Joe Landman wrote: > Quite useful IMO. I know it would be PC, but I (and many others) would s/PC/non-PC/ my fault -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: [EMAIL PROTECTED] web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 845

Re: [Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread Joe Landman
Hi David David Mathog wrote: > Eugen Leitl <[EMAIL PROTECTED]> wrote: > >> http://labs.google.com/papers/disk_failures.pdf > > Interesting. However google apparently uses: > > serial and parallel ATA consumer-grade hard disk drives, > ranging in speed from 5400 to 7200 rpm > > Not quite c

Re: [Beowulf] (no subject)

2007-02-16 Thread Peter St. John
Nathan, You might experiment with the flags to cp; e.g. I might try cp -i -v The -i will prompt you when it wants to overwrite an existing file (maybe at 142MB in you are getting a permissions error) and -v is verbose (so maybe it will stop failing silently). You also might want to specify stderr
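
If the copy runs from a script rather than a terminal, one way to keep it from failing silently is to capture the exit status and stderr. The sketch below is only an illustration: `verbose_copy` is a made-up helper name, and it assumes a `cp` that accepts `-v`. The `-i` flag is deliberately left out because its prompt would block an unattended job.

```python
import subprocess


def verbose_copy(src, dest):
    """Run `cp -v src dest` and return (exit code, stdout, stderr)."""
    # -i is left out on purpose: it prompts on the terminal, which would
    # block a non-interactive script such as a nightly cron job.
    result = subprocess.run(
        ["cp", "-v", src, dest],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr
```

A non-zero exit code together with the captured stderr text should show why the copy stopped, for example a permissions error partway through.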

[Beowulf] (no subject)

2007-02-16 Thread Nathan Moore
Hello all, I have a small beowulf cluster of Scientific Linux 4.4 machines with common NIS logins and NFS shared home directories. In the short term, I'd rather not buy a tape drive for backups. Instead, I've got a jury-rigged backup scheme. The node that serves the home directories vi

[Beowulf] Re: failure trends in a large disk drive population

2007-02-16 Thread David Mathog
Eugen Leitl <[EMAIL PROTECTED]> wrote: > http://labs.google.com/papers/disk_failures.pdf Interesting. However google apparently uses: serial and parallel ATA consumer-grade hard disk drives, ranging in speed from 5400 to 7200 rpm Not quite clear what they meant by "consumer-grade", but I'm

Re: [Beowulf] Teraflop chip hints at the future

2007-02-16 Thread Li, Bo
Intel has been promoting a "many cores, small cores" concept for its Teraflop chip, which was reported recently. I have done some Cell programming and optimization. Many-core architectures will be a bit difficult for programmers, not because of their algorithms but because of their interconnect. When 80 core

[Beowulf] HotI 2007 Call for Papers

2007-02-16 Thread Weikuan Yu
Apologies for _multiple_ copies. Hot Interconnects 15: IEEE Symposium on High-Performance Interconnects, August 22-24, 2007, Stanford University

[Beowulf] DLM internals

2007-02-16 Thread Sudhakar G
Hi, Can anyone tell me how a DLM (Distributed Lock Manager) works internally, i.e., whether the logic for granting locks is centralised or distributed? If distributed, how? Thanks Sudhakar

Re: [Beowulf] failure trends in a large disk drive population

2007-02-16 Thread Justin Moore
http://labs.google.com/papers/disk_failures.pdf Despite my Duke e-mail address, I've been at Google since July. While I'm not a co-author, I'm part of the group that did this study and can answer (some) questions people may have about the paper. -jdm Department of Computer Science, Duke

Re: [Beowulf] failure trends in a large disk drive population

2007-02-16 Thread Mark Hahn
http://labs.google.com/papers/disk_failures.pdf this is awesome! my new new-years resolution is to be more google-like, especially in gathering potentially large amounts of data for this kind of retrospective analysis. thanks for posting the ref.

[Beowulf] failure trends in a large disk drive population

2007-02-16 Thread Eugen Leitl
http://labs.google.com/papers/disk_failures.pdf -- Eugen* Leitl http://leitl.org ICBM: 48.07100, 11.36820 http://www.ativel.com 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE