* Kern Sibbald wrote on 23.02.08 at 12:40:
> Hello,
Hi Kern,
>
> As you know, current job scheduling has a few deficiencies, particularly if for
> some reason your backups get blocked (a bad tape driver or operator
> intervention required), which can lead to a big pile of duplicate jobs being
> scheduled.
>
> We have previously discussed ways of fixing this, with some really good ideas.
>
> I am now ready to take a stab at implementing it, and would like to present
> the current design and let some of you help in the design process. I am
> currently pretty busy with my own project and helping with two major projects
> that are making very nice progress, so I would appreciate some input.
>
> My current idea is to create a new "DuplicateJobs" resource and a new
> Duplicate Jobs directive which would point to the duplicate jobs resource.
> The reason for the resource is that there are just too many different
> variations that it would require a lot of new directives, and it seems a
> shame to add them to every Job.
How about creating a sub-resource in Job that could be put into
JobDefs then too?
Or maybe better a sub-resource of the Schedule resource? (As this is
about scheduling...)
Creating a new "major" resource seems to make it too complicated for what it
provides IMO. -> Keep it simple.
>
> My current design calls for a Duplicate Jobs resource that looks something
> like the following:
>
> DuplicateJobs {
> Name = "xxx"
>
> Allow = yes|no (no = default)
>
> AllowHigherLevel = yes|no (no)
>
> AllowLowerLevel = yes|no (no)
>
> AllowSameLevel = yes|no
Maybe Allow* could be simplified as
AllowLevel = "equal|higher|lower" or just "=|>|<"
where one or more of the keywords may be specified.
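To make the AllowLevel idea concrete, here is a minimal Python sketch of how such a value might be parsed. It is not Bacula code: the function name parse_allow_level is hypothetical, and accepting "|", "," or whitespace as separators is an assumption, since the separator is not settled above.

```python
# Hypothetical parser for the proposed AllowLevel value; not Bacula code.
# Symbol aliases map onto the keyword forms suggested in this mail.
ALIASES = {"=": "equal", ">": "higher", "<": "lower"}
KEYWORDS = set(ALIASES.values())


def parse_allow_level(value):
    """Return the set of level relations named in value.

    Assumption: keywords may be separated by "|", "," or whitespace.
    """
    allowed = set()
    for token in value.replace("|", " ").replace(",", " ").split():
        # Accept either the symbol or the (case-insensitive) keyword.
        name = ALIASES.get(token, token.lower())
        if name not in KEYWORDS:
            raise ValueError(f"unknown AllowLevel keyword: {token!r}")
        allowed.add(name)
    return allowed
```

With this, a single directive can replace the three AllowXXX booleans: parse_allow_level("equal|higher") and parse_allow_level("= >") both yield the same set.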
>
> Cancel = Running | New (no)
Maybe better "CancelJob" or "JobToCancel"
>
> CancelledStatus = Fail | Skip (fail)
CancelledJobStatus ?
>
> Job Proximity = <time-interval> (0)
Maybe better:
Allow Job Overlap = <time-interval>
or
Minimum Duplicate Time = <time-interval>
> The first "Allow" directive is probably not needed, but it does make it more
> complete. If this directive is set to yes, all the other directives would be
> ignored, which would be the same as today and with no Duplicate Jobs
> directive in the Job resource.
>
> The AllowXXX directives are to try to define what job will be allowed to
> continue when there is one job running or waiting and a new one arrives.
> For example AllowHigherLevel = yes, would mean to allow the higher level job
> to continue.
>
> The Cancel directive specifies which job to cancel (the new job or the job
> already there). I think there is probably a logic conflict between this
> directive and the AllowXXX directives, but I have not thought this through
> carefully enough.
>
> The CancelledStatus is an attempt to tell Bacula to either fail one of the
> two jobs or to Skip it, which means to kill it but without a lot of noise.
> Some
> options I could think of here that are not yet clearly specified are:
>
> Do not kill a running job in favor of a newly scheduled job.
> Do not print any messages about cancelling a job (I don't particularly
> like this idea).
> Do not record any cancelled job in the catalog
> ...
>
> Finally Job Proximity is to allow a bit of overlap. For example, if a job
> has been running 20 minutes or ran 20 minutes ago, you might want to not
> apply the rules.
I think a job that ran (= finished?) 20 minutes ago should never be
handled like a duplicate job, because it is not one. For this, a
"Min Time Between Jobs" directive would be better IMO (maybe in the
Schedule resource?).
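The distinction I mean can be sketched in Python as two separate checks. This is only an illustration under my own assumptions: the function names are hypothetical, and an end time of None stands for a job that is still running or waiting.

```python
from datetime import datetime, timedelta


def is_duplicate(existing_end, new_start):
    """A new job is a duplicate only if the earlier job has not finished.

    existing_end is None while the earlier job is still running or waiting.
    """
    return existing_end is None or existing_end > new_start


def too_soon(existing_end, new_start, min_time_between):
    """Apply a hypothetical "Min Time Between Jobs" to a finished job."""
    if existing_end is None:
        # Still running: that case belongs to the duplicate check instead.
        return False
    gap = new_start - existing_end
    return timedelta(0) <= gap < min_time_between


# A job that finished at 12:00, and a new one scheduled for 12:20.
finished = datetime(2008, 2, 23, 12, 0)
new_job = datetime(2008, 2, 23, 12, 20)
```

Here is_duplicate(finished, new_job) is False, while too_soon(finished, new_job, timedelta(minutes=30)) is True, so the two rules stay independent of each other.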
>
> As you can see, there is a lot of room for clarification of what should be
> done, and also a need for a bit more functionality ... -- in other words a
> bit more design is needed before beginning the implementation.
>
> Comments?
As stated above, I think it *might* make sense to create new
directives for Jobs or Schedules instead of a whole new resource. For things
that you do *not* want to put in every Job{} again and again, there is
JobDefs{}.
It could look like this:
Job/JobDefs or Schedule {
  Duplicate Jobs {
    AllowLevel = "equal|higher|lower" or just "=|>|<"
    Cancel Job = Running | New
    Cancelled Status = Fail | Skip
    Minimum Duplicate Time = <time-interval>
  }
}
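As a rough illustration of how the first three directives could interact, here is a Python sketch of one possible reading. The semantics are my assumption and not an agreed design: AllowLevel is taken to name the relations of the new job's level to the existing job's level for which the duplicate may simply run, and otherwise Cancel Job picks the victim. The function names and return values are hypothetical; the level ordering (Full above Differential above Incremental) follows the usual Bacula backup levels.

```python
# Ordering of backup levels; a higher rank means a more complete backup.
LEVEL_RANK = {"Incremental": 0, "Differential": 1, "Full": 2}


def relation(new_level, existing_level):
    """Describe the new job's level relative to the existing job's level."""
    diff = LEVEL_RANK[new_level] - LEVEL_RANK[existing_level]
    if diff == 0:
        return "equal"
    return "higher" if diff > 0 else "lower"


def resolve_duplicate(existing_level, new_level, allow_level,
                      cancel_job, cancelled_status):
    """Return (action, status) for a new job that duplicates an existing one.

    Assumed semantics: if the relation is listed in allow_level, both jobs
    run; otherwise the job named by cancel_job ("Running" or "New") is
    cancelled and ends with cancelled_status ("Fail" or "Skip").
    """
    if relation(new_level, existing_level) in allow_level:
        return ("run_both", None)
    victim = "existing" if cancel_job == "Running" else "new"
    return ("cancel_" + victim, cancelled_status)
```

For example, with allow_level = {"higher"}, Cancel Job = New and Cancelled Status = Skip, a Full arriving while an Incremental is running is allowed, whereas an Incremental arriving while a Full is running is skipped quietly.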
Comments?
-Marc
--
8AAC 5F46 83B4 DB70 8317 3723 296C 6CCA 35A6 4134