Package: debian-reference
Version: 1.08-2
Severity: wishlist
Tags: patch

Greetings Osamu and everyone who works on debian-reference. Thank you for
your wonderful package. A paragraph in section 8.4 reads:
  Combination of one of these with the archiving method described in Copy
  and archive a whole subdirectory, Section 8.3 and the automated regular
  job described in Schedule activity (cron, at), Section 8.6.27 will make
  a nice backup system.

This document is about creating such a system and is meant for inclusion
in debian-reference. It uses regular GNU tar rather than pdumpfs or
subversion. I've looked at many of the 'backup utilities', but those
systems have so far proved either too complicated for me or not flexible
enough to fill my needs.

This work is partly inspired by http://www.bluelavalamp.net/backerupper/
(which did not work for me, because when you are backing up 600 GB the
last thing you are going to be able to do is make a complete archive of
everything *before* scp'ing it somewhere) and by the tar info page, which
has a great section on using tar for backups. This is still a bit of a
work in progress, so feel free to ask for changes.

This system is for people for whom all those CD/DVD based systems would be
too time consuming (and expensive) [600 GB ~= 1000 CDs], who can't afford
a tape based system, but who do have an old system lying around and some
extra hard disks (or could afford the one-time expense of some new
disks). :) I reserve the right to turn this entirely into a doc about
using a pre-canned backup system if I can find one that fits my needs. :)
Maybe rsync...

I expect this information to fit under the current 8.3 and 8.4 sections.
I am writing this specifically for debian-reference, so it is naturally
available under the GPL, or whatever we need it to be under. Change what
you will to make it work for debian-reference. I want to be a contributor;
I want to help out.

This document has been created using outlines in emacs. Consequently the
headings can be folded and more easily managed.

* moving files

[8.3 additions. First, some stuff that could be added to section 8.3.
Note that I use `tar c .` instead of `tar cf - .` as GNU tar uses stdout
by default.]
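That claim is easy to check locally. This is only a sketch with throwaway
temporary paths (not anything from the sections above): with GNU tar both
spellings write the archive to stdout, so the two outputs should compare
equal.

```shell
# Create a tiny source tree and archive it both ways.
w=$(mktemp -d)
mkdir "$w/src"
echo hello > "$w/src/f"
(cd "$w/src" && tar c .)    > "$w/a.tar"   # implicit stdout
(cd "$w/src" && tar cf - .) > "$w/b.tar"   # explicit stdout
# Same input, same member metadata: the archives should be identical.
cmp "$w/a.tar" "$w/b.tar" && echo identical
```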
Sometimes, when dealing with special files, permissions, date/time stamps,
and ownership, cp and scp will not work properly. At times like that you
can turn to tar as the most robust way of moving files from place to
place.

Copy a filesystem from one part of the tree to another:

  $ cd /fromdir && tar c . | (cd /todir && tar x)

Or to a different machine:

  $ tar c . | ssh muggles "cd /todir && tar x"

[See section 9.5 for more information on ssh, and especially 9.5.3, as
many of these commands will fail unless ssh is able to authenticate
without user interaction.]

Or from a different machine:

  $ ssh chaljin "cd /fromdir && tar c ." | tar x

Tar can also be used to confirm that two subdirectories are identical:

  $ cd /fromdir && tar c . | (cd /todir && tar d)

* pipes

[But here I start diverging from 8.3 by talking about dd and pipes.
Really, I'm giving background for the machinery in the next section. I
don't think there is a section on pipes yet, so:]

[ 8.3&1/2 Using pipes ]

A dd conduit between systems:

  $ ssh chaljin "dd if=bin/iSilo386 bs=10k" | dd of=iSilo

Receive a tar file from another system:

  $ ssh chaljin "tar c bin" | dd bs=10k of=test.tar

From one tar to another:

  $ ssh chaljin "tar c bin" | tar x

A dd conduit between two foreign systems:

  $ ssh chaljin "dd if=bin/iSilo386 bs=10k" | ssh muggles "dd of=iSilo"

Transfer the contents of the bin directory between two foreign systems.
The pipe goes through the local system, so it's not efficient for big
backups; this is only to demonstrate its flexibility.

  $ ssh chaljin "cd bin; tar c ." | ssh muggles "cd v; tar x"

A pipe is very flexible. You can pick any From Side for the pipe and
combine it with any To Side, maximizing your options.

** From Side Examples:

A file/device on the local system:

  dd if=/my/file.tar bs=10k |

Or:

  cat file.tar |

A file/device on a foreign system:

  ssh chaljin "dd if=bin/iSilo386.exe bs=10k" |

The output of tar:

  tar c . |

The output of tar on a foreign system in a different directory:
  ssh chaljin "cd /from/dir; tar c ." |

Real world example:

  ssh [EMAIL PROTECTED] "cd /etc/cron; tar -c \
      -g snapshot \
      -X exclude \
      / | dd obs=20KiB" |

** To Side Examples:

A local file/device:

  | dd of=/my/newfile

Append to a foreign file:

  | ssh muggles "dd >> /myforeign/newfile"

Unpack a tarball on a foreign system in a different directory:

  | ssh muggles "cd /to/dir; tar x"

Real world example:

  | dd of=/home/backerupper/chaljin/$(date +%Y-%m-%d).tar

* Tutorial: Running it all from cron

[8.4 additions]

People often tell you to make backups of your system. This document
details /how/ to back up and restore your systems with minimal fuss.

** Justification

[I don't recommend including the justification in debian-reference. It's
too wordy and I don't think anyone will care.]

As a slight side track, or snake's hands, I will attempt to justify
letting the backerupper machine have full root access to the clients being
backed up. Some argue that this is a security bug and want each client to
control what is backed up, how, and when. Doing so adds considerable
complexity to the scheme. Each new client needs to be configured, and if a
client runs a different operating system the entire scheme must be ported
or rewritten for that OS, whereas doing it all on the server eliminates
the need for most of these things.

The point of not giving backerupper explicit root level access is to
prevent a security compromise of backerupper from propagating to all the
clients. Unfortunately, withholding explicit root access does not prevent
anyone who has root access on backerupper from gaining root access to all
the clients. If backerupper is your secure logger and offers no access or
services, the additional risk is reduced. Perhaps it is even better than
allowing each client file access to backerupper and trying to avoid the
possibility of overwriting attacks. Only by encrypting the data on
backerupper might the tables be turned on this argument.

1.
Essential client security files are on backerupper. If /etc/passwd and
/etc/shadow are delivered to backerupper, it is possible to extract
passwords using a tool such as 'john the ripper'.

2. Once backerupper is compromised, any restores done from it could
introduce trojans or back doors, thereby compromising the client.

I believe that using a secure logger as backerupper is in fact more secure
than using an intelligent client model, but I am not a security
professional and offer this as merely my opinion, so naturally I can
guarantee nothing. If you consider the sensitivity of the clients with
respect to the backerupper, then under any model the backerupper is both
the more sensitive machine and the easier one to secure.

** Background

There are many backup systems available. Unfortunately, I'm very simple,
and they were either too complex for me to fathom or not flexible enough
for my needs. If you have a data set containing hundreds of gigabytes or
complex database driven services, then this backup method could be for
you. It uses commonly available tools and hardware, mainly GNU tar and a
spare computer. This method doesn't create duplicates of the data before
archiving, so it works when backing up a terabyte raid that has only 10 MB
of free space.

A backup archive can be stored anywhere, but I suggest using the hard disk
of a separate machine, as disk space is both reusable and inexpensive.
Using a separate system adds an additional level of security protecting
your data. I use only one machine that functions as a backup system for
all my other computers. This backup system makes a good secure logger [1]
if you need such a thing.

The directories on a Debian system to back up are /var /home /etc /root
and possibly /opt and /usr/local, or any other filesystem you hand
tweaked; for instance /boot if you are using a custom kernel, or /usr/src
if you do custom build work there.
As long as you have followed the Filesystem Hierarchy Standard [2] this
list will be quite short and prevent you from having to back up any system
managed executables or files. If you would like to ease bare metal
recovery, save some additional information about the target system, such
as partition information, installed hardware, and needed drivers
(especially those for the drive system and network access), in the archive
data directory.

** Requirements and Procedure

The client system needs to support ssh and scp and have a copy of GNU tar
installed. Everything else is handled by the backup system. The backup
system will need cron, tar, ssh, and scp. In our example, the backup
machine is named 'muggles' and the server being backed up is named
'chaljin'.

Create a 'backerupper' user on the backup machine and generate an ssh key
for him. Put his public key in root's authorized_keys file on the machine
to be backed up. [See section 9.5.3] This makes him equivalent to root on
the target system. He will need this access to save and restore your
critical data.

Add these cron lines to /etc/crontab of your backup system. The first
zeroes the snapshot file for the target machine, which triggers a full
backup; the second runs the backup script. Here a full backup is done
quarterly and incremental backups are done weekly.

  #m h dom month dow user        command
  03 1 1   */3   *   backerupper echo "" > /home/backerupper/chaljin/snapshot
  04 1 *   *     0   backerupper /home/backerupper/chaljin/script.sh

There are several files needed for this system to operate:

  script.sh
  snapshot (generated by tar)
  exclude

muggles $ cat /home/backerupper/chaljin/script.sh
#!/bin/bash

# Save the debconf database
ssh [EMAIL PROTECTED] "debconf-get-selections > /var/backups/debconf-selections"

# Save the package selections
ssh [EMAIL PROTECTED] "dpkg --get-selections '*' > /var/backups/dpkg-selections"

# This system runs some postgresql databases, so we save those to a
# stable state.
# Full and incremental backup of everything except /proc /dev /mnt
# /cdrom and /floppy (as specified in /etc/cron/exclude) to the remote
# computer backerupper. I typically don't use compression, as much of
# the stuff in my archives doesn't compress well, and gzip doesn't deal
# very well with already compressed data. [Running gzip on a compressed
# file is really slow and the file just gets larger.]
# Also, if you have large data, such as photographs, embedded in a
# postgres database, pg_dumpall will not preserve it. In that case you
# will need to add pg_dump commands for the individual databases.

BACKUP=$(date +%Y-%m-%d)

ssh [EMAIL PROTECTED] "su - postgres -c 'pg_dumpall > /var/backup/pgdb.dump'"

if [ -f /home/backerupper/chaljin/snapshot ]; then
    scp /home/backerupper/chaljin/snapshot [EMAIL PROTECTED]:/var/backup/snapshot
fi
scp /home/backerupper/chaljin/exclude [EMAIL PROTECTED]:/var/backup/exclude

# Add --gzip to the tar options below if you want compression.
ssh [EMAIL PROTECTED] "cd /etc/cron; tar -c \
    -g /var/backup/snapshot \
    -X /var/backup/exclude \
    /etc /root /home /var /opt /usr/local" \
    | dd of=/home/backerupper/chaljin/$BACKUP.tar

scp [EMAIL PROTECTED]:/var/backup/snapshot /home/backerupper/chaljin/snapshot

tar tvf /home/backerupper/chaljin/$BACKUP.tar \
    > /home/backerupper/chaljin/$BACKUP.listing
muggles $

[Side note to Osamu: You might want to add a blurb about
'debconf-get-selections' and 'debconf-set-selections' to 6.4.9]

[I've had problems with the blocking factor (-b) in tar, which seems to
have no impact when I use it. `tar c /home | dd obs=10KiB | dd bs=10KiB
count=100` seems to indicate that the blocking tar uses (regardless of the
-b$NUM option) is 512 bytes.]

In the storage directory I have /home/backerupper/chaljin/exclude. This is
copied to the client being backed up (in our example, chaljin) just before
tar will need it. Any file matching one of the patterns listed in exclude
will not make it into the archive. See tar's info pages for details.
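The exclusion mechanism can be rehearsed locally without any remote
machines. This is only a sketch with throwaway temporary paths (none of
them from the script above), showing how a file of patterns passed with -X
keeps matching files out of the archive:

```shell
# Build a tiny tree, exclude everything under home/archive, and list
# what actually lands in the archive.
w=$(mktemp -d)
mkdir -p "$w/tree/etc" "$w/tree/home/archive"
echo keep > "$w/tree/etc/fstab"
echo skip > "$w/tree/home/archive/old.dat"
printf '%s\n' 'home/archive/*' > "$w/exclude"
tar -cf "$w/backup.tar" -C "$w/tree" -X "$w/exclude" etc home
tar -tf "$w/backup.tar"
```

The listing contains etc/fstab but not home/archive/old.dat; the
home/archive directory entry itself is still archived, only its contents
are excluded.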
In this example we don't want to attempt backing up the live postgres
databases, as it would be unsuccessful; they are handled instead by the
pg_dumpall command in our script.sh. Similar adjustments need to be made
for email, news, and other database driven services.

muggles $ cat /home/backerupper/chaljin/exclude
/home/archive/*
/home/share/*
/var/lib/postgres/data/
muggles $

To be able to do a bare metal restore we'll need to know any needed
network drivers, the machine name, the partition layout, any logical
volume management, and the boot loader (such as grub, or yaboot on
powerpc). Right now you should note this kind of information in a file
( muggles:/home/backerupper/chaljin/Notes.txt ) on the backup system.

** Full Restore

To reconstruct the system after a catastrophic failure, install a base
Debian system and then restore the rest from your backups.

1. Reinstall Debian.

2. Restore information from the backup machine. tar extractions done by
root preserve the owners and permissions of files:

  muggles $ cd /home/backerupper/chaljin/
  muggles $ dd < full.tar | ssh [EMAIL PROTECTED] "cd /; tar -x"
  muggles $ dd < inc1.tar | ssh [EMAIL PROTECTED] "cd /; tar -x"
  muggles $ dd < inc2.tar | ssh [EMAIL PROTECTED] "cd /; tar -x"
  muggles $ ssh [EMAIL PROTECTED]
  chaljin # apt-get update
  chaljin # dpkg --set-selections </var/backup/dpkg-selections
  chaljin # debconf-set-selections </var/backup/debconf-selections
  chaljin # su - postgres
  chaljin $ psql -f /var/backup/pgdb.dump template1; exit

At this point you might need to iron out any changes that have occurred in
Debian since you last updated your server. Running 'aptitude' or 'dselect'
at this point might be wise. Also, if Debian were perfect, the packaging
system would be able to deal with administrator modified configuration
files; as it is, you will need to iron out the conflicts, just as one must
when performing any normal upgrade.
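The full-plus-incremental sequence can also be rehearsed locally before
you need it for real. This sketch uses throwaway temporary paths, not the
real archive names; the one extra wrinkle it shows is extracting with
-g /dev/null, which makes GNU tar honour the incremental metadata so that
files deleted between dumps are deleted again on restore (a plain `tar x`
would leave them behind):

```shell
# Take a full dump, change the tree, take an incremental, then restore
# both in order into a fresh directory.
w=$(mktemp -d)
mkdir "$w/data"
echo one > "$w/data/a.txt"
tar -c -g "$w/snapshot" -f "$w/full.tar" -C "$w" data   # level 0 dump
rm "$w/data/a.txt"
echo two > "$w/data/b.txt"
tar -c -g "$w/snapshot" -f "$w/inc1.tar" -C "$w" data   # incremental
mkdir "$w/restore"
tar -x -g /dev/null -f "$w/full.tar" -C "$w/restore"
tar -x -g /dev/null -f "$w/inc1.tar" -C "$w/restore"
# a.txt was deleted before the incremental dump, so only b.txt remains.
ls "$w/restore/data"
```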
If you are restoring a powerpc based Debian, check that your new
partitions match what you have in /etc/yaboot.conf before running ybin;
you might need to wait until after you finish the upgrade to do this.

  chaljin # apt-get upgrade

** Partial Restore

From our archive listing (/home/backerupper/chaljin/2005-02-06.listing) we
learn which files we want to restore. For me this file is greater than
50 MB; only use an editor, such as vim, which is capable of dealing with
such a large text file. Emacs lags a bit but can handle it; do not try
this in MSWord unless you wanted to reboot your machine anyway or can
afford to wait for a long time.

If you only want to restore a few files, I suggest extracting the files
from the archive on the backup system and copying them by hand to the
target:

  muggles $ cd /home/backerupper/chaljin/
  muggles $ mkdir scratch; cd scratch
  muggles $ tar xf ../2005-02-06.tar /etc/netatalk /home/frodo/important.txt
  muggles $ scp -r * [EMAIL PROTECTED]:/

Or, if you need to preserve timestamps & permissions:

  muggles $ tar c . | ssh [EMAIL PROTECTED] "cd /; tar x"

If you have a moderately large set, you may be better off telling tar
which files you want by listing them in a file. This example will extract
all files whose complete path name matches one of the patterns in 'list':

  muggles $ cat list
  *thesis*
  home/cira/blue
  home/cami/red
  muggles $ tar xf 2005-02-06.tar -T list

** Looking up old data without actually restoring it

At this point I use tar to search for and pull out files I'm interested
in. It's slow operating on 20-120 GB archives, but I just live with it at
this point. One possibility here might be to extract the archive into some
kind of a compressed file system; perhaps cloop (see the cloop-utils
package), or squashfs-tools. File-roller gives you a nice gui to look
through the tarballs with, but again it's slow working on large tar files.
For me, it took more than 60 minutes to open one of my 120 GB full dumps
in file-roller.
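The pattern-based extraction above can be tried on a small local archive
first. One hedge: recent GNU tar matches names read with -T literally
unless you also pass --wildcards, so it is included here; older versions
treated such names as patterns by default. All paths below are throwaway
illustrations, not the real archive:

```shell
# Archive a tiny tree, then extract only members matching *thesis*
# from a list file.
w=$(mktemp -d)
mkdir -p "$w/tree/home/cira"
echo blue   > "$w/tree/home/cira/blue"
echo thesis > "$w/tree/home/cira/thesis.tex"
tar -cf "$w/dump.tar" -C "$w/tree" home
printf '%s\n' '*thesis*' > "$w/list"
mkdir "$w/out"
tar -xf "$w/dump.tar" -C "$w/out" --wildcards -T "$w/list"
# Only thesis.tex is extracted; blue is left in the archive.
find "$w/out" -type f
```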
* Notes

[I've tried to make these commands as simple as possible by presuming the
GNU utilities of Debian, rather than trying to make them work with other
versions of the programs.]

[1] A secure logger runs no services, offers no open ports, and is
connected to the system it is watching by a dedicated network port.
Basically, there is no access except physical access; now it's up to you
to provide the physical security.
http://www.tldp.org/HOWTO/Security-HOWTO/secure-prep.html#logs

[2] FHS, the Filesystem Hierarchy Standard, http://www.pathname.com/fhs/
is the document which explains what goes where and why in a unix
filesystem.

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.10-1-686
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages debian-reference depends on:
ii  debian-reference-en  1.08-2  Debian system administration guide

-- no debconf information