Arno Lehmann wrote:
Hi,

26.02.2009 09:46, Gilles Guillotin wrote:
Hi all,

This is my first post on this mailing-list, representing
ASPerience.

Did you discuss your patch with the developers previously?
See my reply in the other mail.
We made some enhancements to Bacula 2.4.0 and created a patch for
this release, which can easily be ported to later releases.

This patch has been created in order to optimize the communications
between the File daemon and the Director.

I understand it increases the communication as well...

Yes and no; that depends on how you use Bacula.
With the new features, Bacula can back up clients whose IP address
changes, such as laptops.

Hmm... I wonder if any new feature is really needed for that. There are ways to assign static hostnames even to machines with changing IP addresses, and there is the setip command which can well be used with a named and ACLed console connection...
Yes and no; it is not the same feature.
There are fewer error messages when a job is canceled because the
File Daemon is absent. Communication between the FD and the DIR
becomes bidirectional, so connections are more frequent.

New features for the DIR:
- When the DIR starts, it tries to connect to the FD. If the
connection succeeds, a presence parameter in the Client resource
changes to "yes". Otherwise the presence parameter keeps its value
"no".
- When the DIR is about to start a new job, it checks the presence
parameter. If the client is present, the DIR starts the job;
otherwise it waits for the client for a period specified in the
Client resource in bacula-dir.conf (the "WaitTimer" parameter).
While waiting, it checks at a regular interval whether the client
has connected (the "PresenceTimer" attribute in bacula-dir.conf).

So the DIR does not connect to the FD blindly (as today)? That would be a big disadvantage IMO...
Not for us.

If the client never connects during the "WaitTimer" period, the job
is marked as "JSAutomaticallyCanceled" in the Catalog.
"JSAutomaticallyCanceled" is a new job status defined in jcr.h; it
means that the job was canceled because the File Daemon never
connected.
- A new file named fd_server.c has been created. It allows the DIR
to listen for File Daemon connections (the default port is 9104, set
by the DIRportFD parameter in the Director resource of
bacula-dir.conf).

Argh... that needs to be officially assigned, and I doubt the Bacula Project will get a third IANA port today... definitely not 9104 which is already assigned to PeerWire.

Also, it requires modifying firewall and tunnel settings for installations where those exist between DIR and FD. This can be a major inconvenience for existing installations on an upgrade.

Thus I would suggest you let this procedure happen on the existing ports (which would obviously be the DIR one...)

I don't remember exactly where we stood when this mail was written, but I think we later corrected this to use the existing port after discussing it with Kern.

The MaxClientsPresence parameter, defined in the Director resource
in bacula-dir.conf, sets how many File Daemons the DIR can listen to
simultaneously.
- Authentication functions are also implemented in authenticate.c
in src/dird and src/filed.

New features for the FD:
- The FD must know the address of the Director, which is stored in
the Director resource in bacula-fd.conf.

What about clients that are backed up by more than one DIR?
It works. Either a Director resource has a conventional configuration, and nothing changes for the FD, or it has the new parameter, and the new behaviour applies (each Director resource has its own parameter).

Also, it knows on which port it can contact the DIR (default 9104).
- When the FD starts, it tries to connect to the DIR. If the
connection succeeds, a presence parameter in the Client resource of
the Director daemon changes to "yes". Otherwise the presence
parameter keeps its default value "no". For authentication it uses
the existing password shared between the File Daemon and the
Director. The File Daemon gives its new address to the DIR, so if
the client is a laptop, jobs can be run whatever its IP address.

The latter can be achieved with bconsole and setip.
Manually, yes. We do it automatically.
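
To make the automatic address update concrete, here is a minimal Director-side sketch in Python. It is not code from the patch: `PresenceRegistry`, `ClientState` and `announce` are hypothetical names, and the plain shared-password comparison only stands in for the existing FD/DIR authentication that the patch is said to reuse.

```python
import hmac
from dataclasses import dataclass


@dataclass
class ClientState:
    """What the Director remembers about one Client resource."""
    address: str
    password: str
    present: bool = False


class PresenceRegistry:
    """Hypothetical Director-side table of clients and their presence flag."""

    def __init__(self, clients):
        # clients: dict mapping client name -> ClientState
        self.clients = clients

    def announce(self, name, password, address):
        """Handle an FD announcing itself; return True if it was accepted."""
        state = self.clients.get(name)
        if state is None:
            return False  # unknown client name
        # Constant-time comparison of the shared secret (a stand-in for the
        # real authentication handshake, which is not modelled here).
        if not hmac.compare_digest(state.password.encode(), password.encode()):
            return False
        state.address = address  # remember where to reach the FD from now on
        state.present = True
        return True
```

With something like this, a laptop that comes back on a different network only has to announce itself once; subsequent jobs use the stored address without anyone running setip by hand.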

- When the File Daemon stops, it warns the DIR that it is going away.
After this warning, presence_parameter = 0: the DIR knows the client
is absent. This feature doesn't work on Windows systems.

That would need to be fixed.
Yes, but if you don't use the feature, you are not affected :)
Perhaps the FD does not terminate on Windows the same way it does on
Linux. At least, on Windows, Bacula does not go through the function
"terminate_filed" in filed.c, so the presence parameter keeps its
value of 1. ----> This could perhaps be improved.

For the connections made when the two daemons start, there is a
retry_interval of 10 seconds (if the connection fails, retry after
10 seconds) and a max_retry_time of 20 seconds (abandon the
connection after 20 seconds).
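
The retry behaviour described above can be sketched as follows. This is an illustrative Python loop, not the patch's code: `connect_with_retry` and `try_connect` are hypothetical names, and whether an attempt exactly at the 20-second mark still happens is an assumption of this sketch.

```python
import time


def connect_with_retry(try_connect, retry_interval=10.0, max_retry_time=20.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call try_connect() until it succeeds or max_retry_time has elapsed.

    try_connect -- callable returning True on a successful connection
    sleep/clock -- injectable so the loop can be exercised without waiting
    """
    deadline = clock() + max_retry_time
    while True:
        if try_connect():
            return True
        # Give up if the next attempt would fall after the deadline.
        if clock() + retry_interval > deadline:
            return False
        sleep(retry_interval)
```

With the values quoted in the mail, a peer that never answers is tried at 0, 10 and 20 seconds and then abandoned.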

Normally, old configurations keep working even though the files are
patched. If there are no configuration files when the patch is
applied, they are created with a new configuration (Presence
parameter, PresenceTimer, WaitTimer, Address of the Director...).
Otherwise you must modify the configuration files yourself: if the
Presence parameter in the Client resource in bacula-dir.conf and the
address attribute in the Director resource in bacula-fd.conf don't
exist, Bacula runs with its classic behaviour.

Hmm... I don't understand if you say that any existing configuration will work as expected or not. Can you clarify this?
Patched Bacula binaries work with existing configurations. The new features are not used when the parameters are not set.

Also, as the DIR needs to contact all its clients at startup, have you done some performance tests? I suspect that, with several hundred clients, a huge delay might result...
Yes, you're right for a very large number of clients, but this is not how we use Bacula.

 Here is an example of a new configuration:


1/ In "bacula-dir.conf"

Director {
  Name = localhost-dir
  DIRport = 9101
  DIRportFD = 9104
  QueryFile = "/usr/bin/query.sql"
  WorkingDirectory = "/var/bacula/working"
  PidDirectory = "/var/bacula/working"
  Maximum Concurrent Jobs = 1
  Password = "*******************"
  Messages = Daemon
  MaxClientsPresence = 20   # NEW: how many clients the DIR can listen to simultaneously
}

What's the default here, and what happens if there are more clients than allowed by this?
This parameter is for parallelizing calls. If there are more clients than that, the extra ones are queued.
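
The queuing behaviour can be pictured with a bounded worker pool. The Python sketch below is only an analogy for what MaxClientsPresence is described as doing; `serve_announcements` and `handle` are hypothetical names, and nothing here is taken from fd_server.c.

```python
from concurrent.futures import ThreadPoolExecutor


def serve_announcements(clients, handle, max_clients_presence):
    """Run handle(client) for every client, at most max_clients_presence at once.

    Clients beyond the limit are not rejected; they wait in the pool's
    internal queue until a worker becomes free.
    """
    with ThreadPoolExecutor(max_workers=max_clients_presence) as pool:
        # map() returns results in the same order as the input clients.
        return list(pool.map(handle, clients))
```

The design point is that the limit bounds concurrency, not the total number of clients: a twenty-first client is served later rather than refused.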


Client {
  Name = localhost-fd
  Address = localhost
  FDPort = 9102
  Catalog = MyCatalog
  Password = "****************"
  File Retention = 30 days
  Job Retention = 6 months
  AutoPrune = yes
  Presence = yes            # NEW: the presence parameter exists
  PresenceTimer = 15        # NEW: maximum time to verify the client presence
  WaitTimer = 60 minutes    # NEW: maximum time to wait for the client
  # PresenceTimer and WaitTimer are in seconds by default. Minutes, hours,
  # days... can be used, as with other time parameters in Bacula.
}


2/ In "bacula-fd.conf"

Director {
  Name = localhost-dir
  Address = localhost
  DIRport = 9104            # NEW
  Password = "*****************"
}


Here is an explanation of a typical communication between the FD
and the DIR:

1/ Starting daemons:

1.1/ DIR starts before FD (most frequent situation)

DIR starts;
DIR tries to connect to FD;
if (FD connected) {
    presence_parameter = 1;
}
FD starts;
FD tries to connect to DIR;
if (DIR connected) {
    presence_parameter = 1;
    FD gives its new address to DIR;
}

1.2/ FD starts before DIR

FD starts;
FD tries to connect to DIR;
if (DIR connected) {
    presence_parameter = 1;
    FD gives its new address to DIR;
}
DIR starts;
DIR tries to connect to FD;
if (FD connected) {
    presence_parameter = 1;
}


2/ Starting job (Backup, Restore):

DIR checks FD presence;
if (FD has no presence_parameter) {
    /* old configuration */
    run job as with the old configuration;
} else {
    /* new configuration */
    if (FD present) {
        run job;
    } else {
        while (WaitTimer has not expired) {
            check FD connection at every PresenceTimer interval;
            if (FD connected) {
                run job;
                stop waiting;
            }
        }
        if (job was not run) {
            mark job as JSAutomaticallyCanceled;
        }
    }
}
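
The job-start logic above can also be written as a small runnable sketch. It is in Python rather than Bacula's own source language, and it is an assumption-laden illustration: `start_job`, `is_present` and `run_job` are hypothetical names, the Client resource is modelled as a plain dict, and both timers are taken to be in seconds.

```python
import time

JS_AUTOMATICALLY_CANCELED = "JSAutomaticallyCanceled"


def start_job(client, is_present, run_job, sleep=time.sleep):
    """Decide whether to run a job now, wait for the FD, or cancel.

    client     -- dict of Client resource settings; PresenceTimer and
                  WaitTimer are in seconds
    is_present -- callable returning True if the FD is currently present
    run_job    -- callable that runs the job and returns its status
    """
    if "Presence" not in client:
        # Old configuration: classic behaviour, no presence check at all.
        return run_job()

    waited = 0
    while True:
        if is_present():
            return run_job()
        if waited >= client["WaitTimer"]:
            # The FD never showed up within WaitTimer.
            return JS_AUTOMATICALLY_CANCELED
        sleep(client["PresenceTimer"])
        waited += client["PresenceTimer"]
```

Returning as soon as the job has run makes it explicit that the canceled status is only reached when the FD never appeared during the wait.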

The patch can be downloaded at:
http://docs.asperience.fr/bacula-2.4.0_ASP.patch


Regards,


Well, I believe I understand what you wanted to solve here, but somehow I don't see why you didn't use the existing functionality.
There is no existing functionality that does what we implemented.

Rerun jobs on failure, limit the time to start a job, dynamic DNS updates for mobile clients, and the setip command of bconsole should be sufficient to achieve what you need.
Not at all.
Can you clarify why those are *not* sufficient for you?
Already explained in August (http://www.mail-archive.com/[email protected]/msg02833.html).

Cheers,

Arno

Jean-Sébastien Hederer
Gérant, ASPerience
http://www.asperience.fr

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
