On Thu, 2007-09-27 at 09:19 +0200, Arno Lehmann wrote:
> Hi,
>
> 27.09.2007 01:17, Ross Boylan wrote:
> > I've been having really slow backups (13 hours) when I back up a large
> > mail spool. I've attached a run report. There are about 1.4M files
> > with a compressed size of 4G. I get much better throughput (e.g.,
> > 2,000KB/s vs 86KB/s for this job!) with other jobs.
>
> 2MB/s is still not especially fast for a backup to disk, I think. So
> your storage disk might also be a factor here.
>
> > First, does it sound as if something is wrong? I suspect the number of
> > files is the key thing, and the mail spool has lots of little files
> > (it's used by Cyrus). Is this just life when you have lots of little
> > files?
> >
> > Second, how can I figure out what the problem is? I do have some
> > suspicions, but first some basics:
> > ------------------------------------------------
> > everything is running on the same box
> > 3GHz P4 with one SATA drive as the main drive and 4 older drives, one of
> > which is the backup target.
> > No noticeable CPU load or disk activity during the backup. I was
> > compressing, but that doesn't show up noticeably for CPU use.
>
> How much memory, and how is the memory usage during backups?
2G of RAM. I'll have to watch it to determine how much is in use.
>
> > Debian GNU/Linux 2.6.18 with postgresql 8.1, bacula 2.2.13.
> > Disk is managed by evms, using LVM.
> > The partition being backed up is ext3, and the backup is going to disk (a
> > different physical disk, IDE) using Reiser.
>
> That's definitely a good thing.
>
> > I am not using snapshotting because that feature is broken right now
> > (nothing to do with bacula). I shut down the cyrus server during the
> > backup (despite some errors in the log around my attempted shutdown, it
> > seemed to have worked).
> >
> > My suspicion is that the TCP/IP transactions are all getting delayed
> > (maybe to batch for sending) in a way that usually isn't noticeable, but
> > is noticeable when doing lots of quick exchanges locally.
>
> I don't know anything about issues with TCP delays, and I know Bacula
> installations running smoothly on all sorts of hardware and different
> OSes.
>
> I rather suspect the catalog is the bottleneck.
>
> Verifying this might be as easy as running vmstat while the job is
> backed up and seeing if there is lots of iowait happening - this does
> not necessarily show as hard disk activity.
Would TCP-induced delays also show up as iowait?
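
For watching iowait over the length of a job, here is a minimal sketch that
reads the same counters vmstat uses. It assumes a Linux /proc/stat whose
aggregate "cpu" line has the usual field order (user, nice, system, idle,
iowait, ...); the function names are made up for illustration. As far as I
know, time spent waiting on a network socket is normally counted as plain idle
rather than iowait, so a high iowait figure would point at disk rather than
TCP, but treat that as an assumption to verify.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples.

    Assumed field order: user nice system idle iowait irq softirq steal.
    Only these first eight fields are summed, because the later guest
    fields are already included in user and nice.
    """
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return the iowait share."""
    with open(path) as stat_file:
        before = parse_cpu_line(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        after = parse_cpu_line(stat_file.read())
    return iowait_percent(before, after)
```

Calling sample_iowait() repeatedly while the backup runs gives roughly what
the "wa" column of vmstat reports.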
>
> Are your database and the mail spool on the same disk? This might
> explain the slowness you encounter.
Yes.
>
> In this case, I'd suggest upgrading to Bacula 2.2.4, for two reasons:
> first, there is a serious bug that will hit you one day, and which
> is fixed in the current version. Second, the new batch inserts feature
> would gain lots of speed if the database throughput really is the
> bottleneck for you.
I see 2.2.4 is in Debian unstable, so I should be able to pull it in.
That would be great if it speeds things up.
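
To illustrate why batching could matter with 1.4M file records, here is a
rough sketch of the general idea. It is not Bacula's actual implementation: it
uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and the table
name, column names and file paths are hypothetical.

```python
import sqlite3
import time


def make_db():
    """Create an in-memory database with a hypothetical file-record table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE file_records (job_id INTEGER, path TEXT)")
    return conn


def insert_one_by_one(conn, rows):
    """One INSERT statement and one commit per file record."""
    for row in rows:
        conn.execute("INSERT INTO file_records VALUES (?, ?)", row)
        conn.commit()


def insert_batched(conn, rows, batch_size=1000):
    """Group records and commit once per batch instead of once per record."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO file_records VALUES (?, ?)", batch)
        conn.commit()


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM file_records").fetchone()[0]


def timed(insert_func, rows):
    """Run one insert strategy on a fresh database; return (seconds, row count)."""
    conn = make_db()
    started = time.perf_counter()
    insert_func(conn, rows)
    elapsed = time.perf_counter() - started
    stored = count_rows(conn)
    conn.close()
    return elapsed, stored


if __name__ == "__main__":
    # Hypothetical paths, only to have something to insert.
    rows = [(1, f"/example/spool/{i}.") for i in range(20000)]
    for func in (insert_one_by_one, insert_batched):
        seconds, stored = timed(func, rows)
        print(f"{func.__name__}: {stored} rows in {seconds:.3f}s")
```

Both strategies store the same rows; only the number of statements and commits
differs. With an in-memory database the gap may be small. Against an on-disk or
networked database each commit and round trip costs more, which is where
batching would be expected to help.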
>
> > Not only are
> > my bacula components using TCP (I think), but I'm communicating with
> > postgres by TCP (I couldn't get authentication working properly with
> > unix domain sockets).
> >
> > While populating the cyrus server I also encountered very slow
> > transaction speeds. I think the TCP problem was the cause, though I
> > don't have definite confirmation. I ran multiple jobs in parallel to
> > populate the cyrus server to get around the slowness of the individual
> > parts (I think that at least rules out things like db contention or disk
> > contention as culprits in that case).
>
> As I don't know about the TCP delay I can't comment on this...
>
> > Unfortunately, AFAIK the tcp delay is not tuneable on Linux; it is with
> > BSD.
> >
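
One way to test the TCP-delay suspicion directly, rather than inferring it
from backup speed, would be to time many small request/response exchanges over
loopback. The sketch below does that with and without TCP_NODELAY, a standard
per-socket option (settable through setsockopt, on Linux as well) that
disables Nagle's algorithm. The echo server, function names and message size
are assumptions for illustration and do not reproduce Bacula's or PostgreSQL's
actual traffic pattern; a plain ping-pong like this may well show no
difference.

```python
import socket
import threading
import time


def echo_server(listener):
    """Accept one connection and echo back whatever arrives until it closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def mean_round_trip(nodelay, exchanges=1000):
    """Average seconds per 1-byte request/response exchange over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()

    client = socket.create_connection(("127.0.0.1", port))
    if nodelay:
        # Disable Nagle's algorithm on the client side of the connection.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    started = time.perf_counter()
    for _ in range(exchanges):
        client.sendall(b"x")
        client.recv(64)  # wait for the echoed byte before sending the next
    elapsed = time.perf_counter() - started

    client.close()
    server.join()
    listener.close()
    return elapsed / exchanges


if __name__ == "__main__":
    for nodelay in (False, True):
        seconds = mean_round_trip(nodelay)
        print(f"TCP_NODELAY={nodelay}: {seconds * 1e6:.1f} microseconds per exchange")
```

If both figures come out in the tens of microseconds, batching delays in the
local TCP stack look like an unlikely explanation for a 13-hour job.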
> > Here are some relevant parts of bacula-dir.conf:
> >
> > JobDefs {
> > Name = "CornDefaults"
> > Type = Backup
> > Level = Incremental
> > Client = corn-fd
> > Storage = File2Storage
> > Messages = Standard
> > Pool = Default
> > Full Backup Pool = Full
> > Differential Backup Pool = Differential
> > Incremental Backup Pool = Incremental
> > Priority = 10
> > Write Bootstrap = "/usr/local/var/spool/bacula/%n.bsr"
> > }
> >
> > ######## Cyrus
> > ## really this needs more care: use snapshot, dump db to ascii
>
> As far as I know, it's sufficient to dump cyrus' database. Given that
> dump and a backup of your mail files, a correct cyrus database can be
> easily regenerated. Snapshots would be a good thing, perhaps, but
> you'd still have to explicitly dump the database as there is no
> guarantee that the disk files of the database are always in a
> consistent state.
cyrus recommends the ascii dump to guard against version changes that
would render the binary unusable.
http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/Backup has more.
You're right: snapshots alone will not assure integrity.
.....
>
> I'm really unsure about TCP problems, but the situation more or less
> looks like the catalog backend would be your problem. Could you try to
> have the catalog db on another machine?
I've only got the one for now.
Thanks.
Ross