I fully agree: with important or large data sets you cannot be
paranoid enough.
Linux and the Mac allow you to easily write scripts that handle
dumping, zipping, copying (locally and elsewhere) and verifying
the data. Once written correctly and tested, they can run fully
automatically with cron. I have been doing this for 15 years.
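A minimal sketch of such a script in Python, to make the four steps
concrete. Everything named here is an assumption for illustration: the
function names, the use of SHA-256 for verification, and the idea that
the dump command (mysqldump, pg_dump or similar) writes to stdout. It
is not the script I actually run.

```python
import gzip
import hashlib
import shutil
import subprocess
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dump_database(command: list[str], target: Path) -> None:
    """Run a dump command and write its stdout to target.

    `command` is supplied by the caller (hypothetical example:
    ["mysqldump", "mydb"]); check=True raises if the command fails.
    """
    with open(target, "wb") as out:
        subprocess.run(command, stdout=out, check=True)


def compress(source: Path) -> Path:
    """Gzip source next to itself and return the path of the .gz file."""
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def copy_and_verify(archive: Path, destinations: list[Path]) -> list[Path]:
    """Copy archive into each directory and confirm the checksums match."""
    expected = sha256_of(archive)
    copies = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        copy = Path(shutil.copy2(archive, directory))
        if sha256_of(copy) != expected:
            raise IOError(f"checksum mismatch for {copy}")
        copies.append(copy)
    return copies
```

The destinations can be a second local disk and a mounted remote
share; copying to a remote host over the network would need a
different copy step. A crontab entry then runs the whole thing
unattended at whatever interval suits the data.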
And where you are advised to burn 2 DVDs, burn 5 of each. Read the
data back on at least two different machines and operating systems.
Send at least one of each by courier to a collaborating colleague on
a different continent.
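One way to make "read the data back elsewhere" checkable is to put a
checksum manifest on each disc and verify it on the other machine. The
sketch below is only an illustration: the file name MANIFEST.sha256
and both function names are made up, and it assumes Python is
available on both systems.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root: Path, manifest_name: str = "MANIFEST.sha256") -> Path:
    """Record one 'digest  relative/path' line per file under root."""
    manifest = root / manifest_name
    files = sorted(p for p in root.rglob("*") if p.is_file() and p != manifest)
    lines = [f"{file_digest(p)}  {p.relative_to(root).as_posix()}" for p in files]
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(root: Path, manifest_name: str = "MANIFEST.sha256") -> list[str]:
    """Return the relative paths that are missing or whose digest differs."""
    failures = []
    text = (root / manifest_name).read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line:
            continue  # tolerate an empty manifest or blank lines
        digest, name = line.split("  ", 1)
        target = root / name
        if not target.is_file() or file_digest(target) != digest:
            failures.append(name)
    return failures
```

Run write_manifest on the directory before burning, and
verify_manifest on the mounted disc afterwards; an empty list means
every file read back intact. The "digest, two spaces, name" layout is
the one sha256sum prints, so the receiving colleague should also be
able to check the disc with that tool instead.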
As they say, different hard disk, different power supply, different
earthquake :-)-O
el
On 21 Oct 2008, at 21:18 , Ted Byers wrote:
[...]
Dr. Snow is right in recommending going the route of using an RDBMS
and in saying that it isn't that hard to get started. I'd be
recommending PostgreSQL, though, since it is relatively easy to use,
and it has pl/r (which lets you run R code within stored procedures
in the DB) which carries obvious advantages.
[...]
If I were in his place, I'd say my data is sacred, and cannot be
replaced (just as you can't step into the same stream twice); and
therefore I'd use an RDBMS to manage it, and the very moment it is all
entered, I'd make a backup of both the data (e.g. in MySQL I'd use
mysqldump) AND the software, and copy both backups to two CDs or DVDs.
And, if the data were originally recorded on paper, I'd be scanning
the pages and copying those images onto a couple of CDs or DVDs also:
with two copies on optical media, one copy can be stored in a
fireproof vault while the other is in the office ready to be used
should a HDD fail, or some other disaster interrupt my work. OK, so
I'm paranoid about my data, but I'd rather go the extra mile than risk
losing it.
--
Dr. Eberhard W. Lisse \ / Obstetrician & Gynaecologist (Saar)
[EMAIL PROTECTED] el108-ARIN / * | Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ / Please send DNS/NA-NiC related e-mail
Bachbrecht, Namibia ;____/ to [EMAIL PROTECTED]
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help