Hello,
I have now added all the Feature Requests that I have received to the projects
file, or at least I think I have. If you submitted a project, and it wasn't
rejected, and it isn't in the file (below), please let me know.
I'll allow submissions until early next week, then will send an email
requesting you to vote. Last year we had good participation in the voting (yes,
it took a lot of work to count the votes!), so this year, I hope there is even
more. Please wait until I send you voting instructions next week ...
As I explained in a recent email, voting is a way that you can help determine
the priority given to particular projects. Of course, a high priority
doesn't automatically mean a project will be implemented in the next release
-- it requires someone to program it, but I generally focus at least 80% of
my programming effort on the high-priority projects ...
As also mentioned previously, I will be signing up for fewer projects and
probably choosing ones that appeal to me more than in the past -- this will
give me more time to ensure that the increasing volume of contributed code is
well integrated into Bacula.
For the moment, I have chosen to work on the following projects:
Item 1: Implement a Migration job type that will move the job
data from one device to another.
Item 3: Implement a Bacula GUI/management tool using Python
and Qt.
Item 5: Implement more Python events in Bacula.
Item 12: Implement red/black binary tree routines.
I have put #3 on hold because Lucas is making good progress on the same
project, but for GTK+ rather than Qt, which is OK with me.
Item 5 is rather small and ongoing; I will get to it over time ...
Item 12: the red/black class methods are already implemented. Integrating
them into the Bacula restore in-memory tree routines is a different story.
I'm not 100% sure I will complete this ...
Item 1 is a big project. Normally, I wouldn't be working on it at this point
because I'm a bit tired of the Storage daemon after the multiple autochanger
project. However, Riege Software International GmbH has made a *very*
generous pledge of a 2,500 Euro contribution to the Bacula project to be made
in three stages for the completion of this project. I should be beginning
the project sometime next week, and hope to have it in beta testing by April
2006 and in production by June 2006.
Below, I include the latest projects file.
Best regards,
Kern
Projects:
Bacula Projects Roadmap
24 November 2005
Below, you will find more information on future projects:
Item 1: Implement a Migration job type that will move the job
data from one device to another.
Origin: Sponsored by Riege Software International GmbH. Contact:
Daniel Holtkamp <holtkamp at riege dot com>
Date: 28 October 2005
Status: Partially coded in 1.37 -- much more to do. Assigned to
Kern.
What: The ability to copy, move, or archive data that is on a
device to another device is very important.
Why: An ISP might want to backup to disk, but after 30 days
migrate the data to tape backup and delete it from
disk. Bacula should be able to handle this
automatically. It needs to know what was put where,
and when, and what to migrate -- it is a bit like
retention periods. Doing so would allow space to be
freed up for current backups while maintaining older
data on tape drives.
Notes: Riege Software has asked for the following migration
triggers:
Age of Job
Highwater mark (stopped by Lowwater mark?)
Notes: Migration could be additionally triggered by:
Number of Jobs
Number of Volumes
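To make the highwater/lowwater idea concrete, here is a minimal sketch of how such a trigger might behave. The function and field names are purely illustrative, not Bacula code:

```python
def select_migration_jobs(jobs, pool_bytes, highwater, lowwater):
    """Once pool usage exceeds the highwater mark, pick the oldest
    jobs to migrate until usage would drop to the lowwater mark."""
    if pool_bytes <= highwater:
        return []                      # nothing to do yet
    selected = []
    for job in sorted(jobs, key=lambda j: j["end_time"]):  # oldest first
        if pool_bytes <= lowwater:
            break
        selected.append(job["id"])
        pool_bytes -= job["bytes"]     # migrating frees this much space
    return selected
```

The same skeleton extends naturally to the other triggers listed (age of job, number of jobs, number of volumes) by changing the entry condition and sort key.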
Item 2: Implement extraction of Win32 BackupWrite data.
Origin: Thorsten Engel <thorsten.engel at matrix-computer dot com>
Date: 28 October 2005
Status: Assigned to Thorsten. Implemented in current CVS
What: This provides the Bacula File daemon with code that
can pick apart the stream output that Microsoft writes
for BackupWrite data, and thus the data can be read
and restored on non-Win32 machines.
Why: BackupWrite data is the portable=no option in Win32
FileSets, and in previous Baculas, this data could
only be extracted using a Win32 FD. With this new code,
the Windows data can be extracted and restored on
any OS.
Item 3: Implement a Bacula GUI/management tool using Python
and Qt.
Origin: Kern
Date: 28 October 2005
Status:
What: Implement a Bacula console, and management tools
using Python and Qt.
Why: Don't we already have a wxWidgets GUI? Yes, but
it is written in C++ and changes to the user interface
must be hand tailored using C++ code. By developing
the user interface using Qt designer, the interface
can be very easily updated and most of the new Python
code will be automatically created. The user interface
changes become very simple, and only the new features
must be implemented. In addition, the code will be in
Python, which will give many more users easy (or easier)
access to making additions or modifications.
Item 4: Implement a Python interface to the Bacula catalog.
Date: 28 October 2005
Origin: Kern
Status:
What: Implement an interface for Python scripts to access
the catalog through Bacula.
Why: This will permit users to customize Bacula through
Python scripts.
Item 5: Implement more Python events in Bacula.
Date: 28 October 2005
Origin:
Status:
What: Allow Python scripts to be called at more places
within Bacula and provide additional access to Bacula
internal variables.
Why: This will permit users to customize Bacula through
Python scripts.
Notes: Recycle event
Scratch pool event
NeedVolume event
MediaFull event
Item 6: Implement Base jobs.
Date: 28 October 2005
Origin: Kern
Status:
What: A base job is sort of like a Full save except that you
will want the FileSet to contain only files that are
unlikely to change in the future (i.e. a snapshot of
most of your system after installing it). After the
base job has been run, when you are doing a Full save,
you specify one or more Base jobs to be used. All
files that have been backed up in the Base job/jobs but
not modified will then be excluded from the backup.
During a restore, the Base jobs will be automatically
pulled in where necessary.
Why: This is something none of the competition does, as far as
we know (except perhaps BackupPC, which is a Perl program that
saves to disk only). It is a big win for the user: it
makes Bacula stand out as offering a unique
optimization that immediately saves time and money.
Basically, imagine that you have 100 nearly identical
Windows or Linux machines containing the OS and user
files. Now for the OS part, a Base job will be backed
up once, and rather than making 100 copies of the OS,
there will be only one. If one or more of the systems
have some files updated, no problem: they will be
automatically restored.
Notes: Huge savings in tape usage even for a single machine.
Will require more resources because the DIR must send
the FD a list of files/attribs, and the FD must search the
list and compare it for each file to be saved.
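The comparison the note describes can be sketched like this (illustrative only; the real FD would compare full attribute streams, not just mtime and size):

```python
def files_to_backup(current, base):
    """current and base map path -> (mtime, size). A file is skipped
    only if a Base job saved it with identical attributes; anything
    new or modified is backed up as usual."""
    return sorted(path for path, attrs in current.items()
                  if base.get(path) != attrs)
```

For 100 nearly identical machines, each Full would then carry only the per-machine differences against the shared Base list.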
Item 7: Add Plug-ins to the FileSet Include statements.
Date: 28 October 2005
Origin:
Status: Partially coded in 1.37 -- much more to do.
What: Allow users to specify wild-card and/or regular
expressions to be matched in both the Include and
Exclude directives in a FileSet. At the same time,
allow users to define plug-ins to be called (based on
regular expression/wild-card matching).
Why: This would give the users the ultimate ability to control
how files are backed up/restored. A user could write a
plug-in that knows how to back up his Oracle database without
stopping/starting it, for example.
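As a sketch of the matching idea (hypothetical names; the plug-in API itself is not yet defined), wild-card patterns could be checked in order and the first match dispatched to its plug-in:

```python
import fnmatch

def dispatch(path, plugins):
    """plugins: list of (wild-card pattern, handler) pairs, checked
    in order; the first matching handler processes the file."""
    for pattern, handler in plugins:
        if fnmatch.fnmatch(path, pattern):
            return handler(path)
    return None                        # no plug-in claims this file
```

Regular-expression patterns would work the same way with re.match in place of fnmatch.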
Item 8: Implement huge exclude list support using hashing.
Date: 28 October 2005
Origin: Kern
Status:
What: Allow users to specify very large exclude lists (currently
more than about 1000 files is too many).
Why: This would give the users the ability to exclude all
files that are loaded with the OS (e.g. using rpms
or debs). If the user can restore the base OS from
CDs, there is no need to backup all those files. A
complete restore would be to restore the base OS, then
do a Bacula restore. By excluding the base OS files, the
backup set will be *much* smaller.
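The speed-up comes from replacing a linear scan of the exclude list with a hash lookup. A minimal sketch (exact-path matches only; wild-cards would still need separate handling):

```python
def build_exclude_set(paths):
    # A Python set is a hash table: membership tests cost O(1) on
    # average instead of O(n) for a linear scan of the exclude list.
    return set(paths)

def is_excluded(path, exclude_set):
    return path in exclude_set
```

With this, an exclude list generated from the full rpm/deb package manifests (easily hundreds of thousands of paths) stays cheap to consult for every file considered.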
Item 9: Implement data encryption (as opposed to communications
encryption)
Date: 28 October 2005
Origin: Sponsored by Landon and 13 contributors to EFF.
Status: Landon Fuller is currently implementing this.
What: Currently the data that is stored on the Volume is not
encrypted. For confidentiality, encryption of data at
the File daemon level is essential.
Data encryption encrypts the data in the File daemon and
decrypts the data in the File daemon during a restore.
Why: Large sites require this.
Item 10: Permit multiple Media Types in an Autochanger
Origin:
Status:
What: Modify the Storage daemon so that multiple Media Types
can be specified in an autochanger. This would be somewhat
of a simplistic implementation in that each drive would
still be allowed to have only one Media Type. However,
the Storage daemon will ensure that only a drive with
the Media Type that matches what the Director specifies
is chosen.
Why: This will permit users with several different drive types
to make full use of their autochangers.
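The selection rule could be sketched as follows (names illustrative; each drive still carries exactly one Media Type):

```python
def pick_drive(drives, wanted_type):
    """drives: list of {'name', 'media_type', 'busy'} dicts. Return a
    free drive whose Media Type matches the Director's request."""
    for d in drives:
        if d["media_type"] == wanted_type and not d["busy"]:
            return d["name"]
    return None                  # no suitable drive in this changer
```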
Item 11: Allow two different autochanger definitions that refer
to the same autochanger.
Date: 28 October 2005
Origin: Kern
Status:
What: Currently, the autochanger script is locked based on
the autochanger. That is, if multiple drives are being
simultaneously used, the Storage daemon ensures that only
one drive at a time can access the mtx-changer script.
This change would base the locking on the control device,
rather than the autochanger. It would then permit two autochanger
definitions for the same autochanger, but with different
drives. Logically, the autochanger could then be "partitioned"
for different jobs, clients, or classes of jobs, and if the locking
is based on the control device (e.g. /dev/sg0) the mtx-changer
script will be locked appropriately.
Why: This will permit users to partition autochangers for specific
use. It would also permit implementation of multiple Media
Types with no changes to the Storage daemon.
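The essence of the change is keying the lock on the control device path rather than on the autochanger resource. A sketch (illustrative, not SD code):

```python
import threading

_locks = {}
_registry = threading.Lock()

def changer_lock(control_device):
    """Every autochanger definition naming the same control device
    (e.g. /dev/sg0) receives the same lock object, so two
    "partitioned" definitions of one physical changer still
    serialize their mtx-changer calls."""
    with _registry:
        return _locks.setdefault(control_device, threading.Lock())
```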
Item 12: Implement red/black binary tree routines.
Date: 28 October 2005
Origin: Kern
Status:
What: Implement a red/black binary tree class. This could
then replace the current binary insert/search routines
used in the restore in-memory tree. This could significantly
speed up the creation of the in-memory restore tree.
Why: Performance enhancement.
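The expected gain can be illustrated with a toy measurement (this is not Bacula's tree code): plain binary insertion degenerates into a linked list when keys arrive in sorted order, which is common when file names come from a directory traversal, while a red/black tree bounds the depth at roughly 2*log2(n+1).

```python
import math

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    """Plain (unbalanced) binary insert; returns (root, depth reached)."""
    if root is None:
        return Node(key), 0
    node, depth = root, 0
    while True:
        depth += 1
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root, depth
        node = child

n = 1024
root, worst = None, 0
for key in range(n):               # sorted input: the worst case
    root, d = bst_insert(root, key)
    worst = max(worst, d)

# Unbalanced depth grows linearly (1023 here); a red/black tree
# would guarantee a depth of at most 2*log2(n+1), i.e. about 20.
assert worst == n - 1
assert 2 * math.log2(n + 1) < 21
```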
Item 13: Let Bacula log tape usage and handle drive cleaning cycles.
Date: November 11, 2005
Origin: Arno Lehmann <al at its-lehmann dot de>
Status:
What: Make Bacula manage tape life cycle information and drive
cleaning cycles.
Why: Both parts of this project are important when operating backups.
We need to know which tapes need replacement, and we need to
make sure the drives are cleaned when necessary. While many
tape libraries and even autoloaders can handle all this
automatically, support by Bacula can be helpful for smaller
(older) libraries and single drives. Also, checking drive
status during operation can prevent some failures (as I had to
learn the hard way...)
Notes: First, Bacula could (and even does, to some limited extent)
record tape and drive usage. For tapes, the number of mounts,
the amount of data, and the time the tape has actually been
running could be recorded. Data fields for Read and Write time
and Number of mounts already exist in the catalog (I'm not sure
if VolBytes is the sum of all bytes ever written to that volume
by Bacula). This information can be important when determining
which media to replace. For the tape drives known to Bacula,
similar information is interesting to determine the device
status and expected life time: Time it's been Reading and
Writing, number of tape Loads / Unloads / Errors. This
information is not yet recorded as far as I know.
The next step would be implementing drive cleaning setup.
Bacula already has knowledge about cleaning tapes. Once it has
some information about cleaning cycles (measured in drive run
time, number of tapes used, or calendar days, for example) it
can automatically execute tape cleaning (with an autochanger,
obviously) or ask for operator assistance loading a cleaning
tape.
The next step would be to implement TAPEALERT checks not only
when changing tapes and only sending the information to the
administrator, but rather checking after each tape error,
checking on a regular basis (for example after each tape file),
and also before unloading and after loading a new tape. Then,
depending on the drive's TAPEALERT state and the known drive
cleaning state, Bacula could automatically schedule later
cleaning, clean immediately, or inform the operator.
Implementing this would perhaps require another catalog change
and perhaps major changes in SD code and the DIR-SD protocol,
so I'd only consider this worth implementing if it would
actually be used or even needed by many people.
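The cleaning-cycle bookkeeping could start as simply as this sketch (field names hypothetical):

```python
def cleaning_due(drive, max_run_hours):
    """drive: dict with cumulative 'run_hours' and the run-hour
    count recorded at the last cleaning. Other cycle measures
    (tapes used, calendar days) would follow the same pattern."""
    return drive["run_hours"] - drive["cleaned_at_hours"] >= max_run_hours
```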
Item 14: Merging of multiple backups into a single one (also called
Synthetic Backup or Consolidation).
Origin: Marc Cousin and Eric Bollengier
Date: 15 November 2005
Status: Depends on first implementing project Item 1 (Migration).
What: A merged backup is a backup made without connecting to the Client.
It would be a Merge of existing backups into a single backup.
In effect, it is like a restore but to the backup medium.
For instance, say that last Sunday we made a full backup. Then
all week long, we created incremental backups, in order to do
them fast. Now comes Sunday again, and we need another full.
The merged backup makes it possible to do instead an incremental
backup (during the night for instance), and then create a merged
backup during the day, by using the full and incrementals from
the week. The merged backup will be exactly like a full made
Sunday night on the tape, but the production interruption on the
Client will be minimal, as the Client will only have to send
incrementals.
In fact, if it's done correctly, you could merge all the
Incrementals into a single Incremental, or all the Incrementals
and the last Differential into a new Differential, or the Full,
last differential and all the Incrementals into a new Full
backup. And there is no need to involve the Client.
Why: The benefit is that:
- the Client just does an incremental;
- the merged backup on tape is just like a single full backup,
and can be restored very fast.
This is also a way of reducing the backup data since the old
data can then be pruned (or not) from the catalog, possibly
allowing older volumes to be recycled.
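The merge rule reduces to "keep the newest version of each file". A sketch (illustrative; real consolidation would also need to honor file deletions, e.g. via tombstone records):

```python
def merge_backups(chain):
    """chain: list of {path: version} maps, oldest first
    (Full, then Differential, then Incrementals)."""
    merged = {}
    for backup in chain:
        merged.update(backup)    # later backups override earlier ones
    return merged
```

This is exactly "a restore, but to the backup medium": the result is equivalent to a fresh Full without ever contacting the Client.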
Item 15: Automatic disabling of devices
Date: 2005-11-11
Origin: Peter Eriksson <peter at ifm.liu dot se>
Status:
What: After a configurable number of fatal errors with a tape drive
Bacula should automatically disable further use of a certain
tape drive. There should also be "disable"/"enable" commands in
the "bconsole" tool.
Why: On a multi-drive jukebox there is a possibility of tape drives
going bad during large backups (needing a cleaning tape run,
tapes getting stuck). It would be advantageous if Bacula would
automatically disable further use of a problematic tape drive
after a configurable number of errors has occurred.
An example: I have a multi-drive jukebox (6 drives, 380+ slots)
where tapes occasionally get stuck inside the drive. Bacula will
notice that the "mtx-changer" command will fail and then fail
any backup jobs trying to use that drive. However, it will still
keep on trying to run new jobs using that drive and fail -
forever, and thus failing lots and lots of jobs... Since we have
many drives Bacula could have just automatically disabled
further use of that drive and used one of the other ones
instead.
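The error-threshold logic requested here could be sketched as follows (class and method names are hypothetical):

```python
class DriveGuard:
    def __init__(self, max_errors=3):
        self.max_errors = max_errors
        self.errors = {}
        self.disabled = set()

    def record_error(self, drive):
        self.errors[drive] = self.errors.get(drive, 0) + 1
        if self.errors[drive] >= self.max_errors:
            self.disabled.add(drive)   # stop scheduling jobs on it

    def record_success(self, drive):
        self.errors[drive] = 0         # only consecutive failures count

    def usable(self, drive):
        # A disabled drive stays disabled until the operator issues
        # the proposed bconsole "enable" command.
        return drive not in self.disabled
```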
Item 16: Directive/mode to backup only file changes, not entire file
Date: 11 November 2005
Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
Marek Bajon <mbajon at bimsplus dot com dot pl>
Status: RFC
What: Currently when a file changes, the entire file will be backed up in
the next incremental or full backup. To save space on the tapes
it would be nice to have a mode whereby only the changes to the
file would be backed up when it is changed.
Why: This would save lots of space when backing up large files such as
logs, mbox files, Outlook PST files and the like.
Notes: This would require the usage of disk-based volumes as comparing
files would not be feasible using a tape drive.
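One way the idea could work (a sketch only, not a proposal for Bacula's actual stream format): split the file into fixed-size blocks, hash each block, and back up only the blocks whose hash changed since the previous backup.

```python
import hashlib

BLOCK = 4096

def block_hashes(data):
    return [hashlib.md5(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def changed_blocks(old_hashes, data):
    """Return (index, block) pairs that differ from the prior backup."""
    new_hashes = block_hashes(data)
    return [(i, data[i * BLOCK:(i + 1) * BLOCK])
            for i, h in enumerate(new_hashes)
            if i >= len(old_hashes) or old_hashes[i] != h]
```

For append-mostly files like logs and mbox files, almost all blocks would be unchanged, which is where the space saving comes from. Random access into the old data is also why this needs disk-based volumes.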
Item 17: Quick release of FD-SD connection
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005
Status:
What: In the Bacula implementation a backup is finished after all data
and attributes are successfully written to storage. When using a
tape backup it is very annoying that a backup can take a day,
simply because the current tape (or whatever) is full and the
administrator has not put a new one in. During that time the
system cannot be taken off-line, because there is still an open
session between the storage daemon and the file daemon on the
client.
Although this is a very good strategy for making "safe backups",
it can be annoying for e.g. laptops, which must remain
connected until the backup is completed.
Using a new feature called "migration" it will be possible to
spool first to harddisk (using a special 'spool' migration
scheme) and then migrate the backup to tape.
There is still the problem of getting the attributes committed.
If it takes a very long time to do, with the current code, the
job has not terminated, and the File daemon is not freed up. The
Storage daemon should release the File daemon as soon as all the
file data and all the attributes have been sent to it (the SD).
Currently the SD waits until everything is on tape and all the
attributes are transmitted to the Director before signalling
completion to the FD. I don't think I would have any problem
changing this. The reason is that even if the FD reports back to
the Dir that all is OK, the job will not terminate until the SD
has done the same thing -- so in a way keeping the SD-FD link
open to the very end is not really very productive ...
Why: Makes backup of laptops much easier.
Item 18: Add support for CACHEDIR.TAG
Origin: Norbert Kiesel <nkiesel at tbdnetworks dot com>
Date: 21 November 2005
Status:
What: CACHEDIR.TAG is a proposal for identifying directories which
should be ignored for archiving/backup. It works by ignoring
directory trees which have a file named CACHEDIR.TAG with a
specific content. See
http://www.brynosaurus.com/cachedir/spec.html
for details.
From Peter Eriksson:
I suggest that if this is implemented (I've also asked for this
feature some year ago) that it is made compatible with Legato
NetWorker's ".nsr" files where you can specify a lot of options on
how to handle files/directories (including denying further
parsing of .nsr files lower down into the directory trees). A
PDF version of the .nsr man page can be viewed at:
http://www.ifm.liu.se/~peter/nsr.pdf
Why: It's a nice alternative to "exclude" patterns for directories
which don't have regular pathnames. Also, it allows users to
control backup for themselves. Implementation should be pretty
simple. GNU tar >= 1.14 or so supports it, too.
Notes: I envision this as an optional feature to a fileset
specification.
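A sketch of how a FileSet option might honor the tag (function names here are hypothetical). Per the spec referenced above, a tagged directory contains a file named CACHEDIR.TAG whose content begins with a fixed signature line:

```python
import os

CACHEDIR_SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"

def is_cache_dir(dirpath):
    tag = os.path.join(dirpath, "CACHEDIR.TAG")
    try:
        with open(tag, "rb") as f:
            return f.read(len(CACHEDIR_SIGNATURE)) == CACHEDIR_SIGNATURE
    except OSError:
        return False

def walk_for_backup(top):
    """Yield directories to back up, pruning tagged cache trees."""
    for root, dirs, _files in os.walk(top):
        dirs[:] = [d for d in dirs
                   if not is_cache_dir(os.path.join(root, d))]
        yield root
```

Pruning the dirs list in place stops os.walk from descending into the tagged tree, matching the spec's "ignore the whole subtree" semantics.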
Item 19: Implement new {Client}Run{Before|After}Job feature.
Date: 26 September 2005
Origin: Phil Stracchino <phil.stracchino at speakeasy dot net>
Status:
What: Some time ago, there was a discussion of RunAfterJob and
ClientRunAfterJob, and the fact that they do not run after failed
jobs. At the time, there was a suggestion to add a
RunAfterFailedJob directive (and, presumably, a matching
ClientRunAfterFailedJob directive), but to my knowledge these
were never implemented.
An alternate way of approaching the problem has just occurred to
me. Suppose the RunBeforeJob and RunAfterJob directives were
expanded in a manner something like this example:
RunBeforeJob {
Command = "/opt/bacula/etc/checkhost %c"
RunsOnClient = No
RunsAtJobLevels = All # All, Full, Diff, Inc
AbortJobOnError = Yes
}
RunBeforeJob {
Command = c:/bacula/systemstate.bat
RunsOnClient = yes
RunsAtJobLevels = All # All, Full, Diff, Inc
AbortJobOnError = No
}
RunAfterJob {
Command = c:/bacula/deletestatefile.bat
RunsOnClient = Yes
RunsAtJobLevels = All # All, Full, Diff, Inc
RunsOnSuccess = Yes
RunsOnFailure = Yes
}
RunAfterJob {
Command = c:/bacula/somethingelse.bat
RunsOnClient = Yes
RunsAtJobLevels = All
RunsOnSuccess = No
RunsOnFailure = Yes
}
RunAfterJob {
Command = "/opt/bacula/etc/checkhost -v %c"
RunsOnClient = No
RunsAtJobLevels = All
RunsOnSuccess = No
RunsOnFailure = Yes
}
Why: It would be a significant change to the structure of the
directives, but would allow for a lot more flexibility, including
RunAfter commands that will run regardless of whether the job
succeeds, or RunBefore tasks that still allow the job to run even
if that specific RunBefore fails.
Notes: By Kern: I would prefer to have a single new Resource called
RunScript. More notes from Phil:
RunBeforeJob = yes|no
RunAfterJob = yes|no
RunsAtJobLevels = All|Full|Diff|Inc
The AbortJobOnError, RunsOnSuccess and RunsOnFailure directives
could be optional, and possibly RunsWhen as well.
AbortJobOnError would be ignored unless RunsWhen was set to Before
(or RunsBeforeJob set to Yes), and would default to Yes if
omitted. If AbortJobOnError was set to No, failure of the script
would still generate a warning.
RunsOnSuccess would be ignored unless RunsWhen was set to After
(or RunsBeforeJob set to No), and default to Yes.
RunsOnFailure would be ignored unless RunsWhen was set to After,
and default to No.
Allow having the before/after status on the script command
line so that the same script can be used both before/after.
David Boyes.
Item 20: "Maximum Rewrite (Recycle) Times" for a tape
Date: 8 November 2005
Origin: Adam Thornton <athornton at sinenomine dot net>
Status:
What: The ability to use a Volume for at most N re-write times, marking
it unavailable after that.
Why: I was working with a customer this morning who mentioned that it
would be useful to automatically age out tapes that had been
rewritten enough times that the media lifespan was becoming
questionable (his old backup system supported saying "rewrite this
tape 50 times at most"). Neither Maximum Volume Jobs nor Volume Use
Duration quite does this, because you can have multiple jobs per
volume, and it's not entirely guaranteed that time and number of
write cycles map linearly (although they probably do in actual
usage), and also because "Used" volumes can be recycled, and the
status we want here is something like "full and now unwriteable."
I haven't looked but I suspect this would require a change in the
database format, to keep track of how many cycles a volume has
been through, and maybe an additional Status type.
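The proposed directive could be sketched like this (directive and field names hypothetical; the per-volume recycle count would be the new catalog field mentioned above):

```python
def try_recycle(volume, max_recycles):
    """volume: dict with 'recycles' and 'status' fields. Refuse to
    recycle once MaximumRecycleTimes is reached, moving the volume
    to a new terminal status instead."""
    if volume["recycles"] >= max_recycles:
        volume["status"] = "Unavailable"   # full and now unwritable
        return False
    volume["recycles"] += 1
    volume["status"] = "Recycled"
    return True
```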
Item 21: Allow FD to initiate a backup
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005
Status:
What: Provide some means, possibly by a restricted console that
allows an FD to initiate a backup, and that uses the connection
established by the FD to the Director for the backup so that
a Director that is firewalled can do the backup.
Why: Makes backup of laptops much easier.
============= Empty Feature Request form ===========
Item n: One line summary ...
Date: Date submitted
Origin: Name and email of originator.
Status:
What: More detailed explanation ...
Why: Why it is important ...
Notes: Additional notes or features (omit if not used)
============== End Feature Request form ==============
_______________________________________________
Bacula-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-users