Arno Lehmann wrote:
> Hi,
>
> 26.02.2009 09:46, Gilles Guillotin wrote:
>> Hi all,
>>
>> This is my first post on this mailing list, representing ASPerience.
>
> Did you discuss your patch with the developers previously?

I answered that in another mail.

>> We made some enhancements to Bacula 2.4.0 and created a patch for this release, which should be easy to port to later releases. The patch was created to optimize the communication between the File daemon and the Director.
>
> I understand it increases the communication as well...

Yes and no. That depends on how you use Bacula.

>> With the new features, Bacula can back up clients whose IP address changes, such as laptops.
>
> Hmm... I wonder if any new feature is really needed for that. There are ways to assign static hostnames even to machines with changing IP addresses, and there is the setip command, which can well be used with a named and ACLed console connection...

Yes and no. It is not the same feature.

>> There are fewer error messages when a job is canceled because the File Daemon is absent. The communication between FD and DIR becomes bidirectional, so connections are more frequent.
>>
>> New features for the DIR:
>>
>> - When the DIR starts, it tries to connect to the FD. If the connection succeeds, a presence parameter in the Client resource changes to "yes". Otherwise the presence parameter keeps its value "no".
>> - When the DIR is about to start a new job, it checks the presence parameter. If the client is present, the DIR starts the job; otherwise it waits for the client for a time specified in the Client resource in bacula-dir.conf (this parameter is named "WaitTimer"). While waiting, it checks whether the client has connected at a fixed interval (attribute "PresenceTimer" in bacula-dir.conf).
>
> So the DIR does not connect to the FD blindly (as today)? That would be a big disadvantage IMO...

Not for us.

>>   If the client never connects during the "WaitTimer" period, the job is marked as "JSAutomaticallyCanceled" in the Catalog. "JSAutomaticallyCanceled" is a new value defined in jcr.h; it means that the job was canceled because the File daemon never connected.
>> - A new file named fd_server.c has been created. It allows the DIR to listen for File Daemon connections (the default port is 9104, parameter DIRportFD in the Director resource of bacula-dir.conf).
>
> Argh... that needs to be officially assigned, and I doubt the Bacula Project will get a third IANA port today... definitely not 9104, which is already assigned to PeerWire.
>
> Also, it requires modifying firewall and tunnel settings for installations where those exist between DIR and FD. This can be a major inconvenience for existing installations on an upgrade.
>
> Thus I would suggest you let this procedure happen on the existing ports (which would obviously be the DIR one...)

I don't remember for certain as I write this mail, but I think we changed this to use the existing port after a discussion with Kern.

>> - The parameter MaxClientsPresence, defined in the Director resource in bacula-dir.conf, sets how many File Daemons the DIR can listen to simultaneously.
>> - Authentication functions are also implemented in authenticate.c in src/dird and src/filed.
>>
>> New features for the FD:
>>
>> - The FD must know the address of the Director, which is stored in the Director resource in bacula-fd.conf.
>
> What about clients that are backed up by more than one DIR?

It works. Either a Director resource has a conventional configuration, and nothing changes in how the FD behaves towards it, or a Director resource has the new parameter, and the new behaviour applies (each Director resource has its own parameter).

>>   Also, it knows on which port it can contact the DIR (default 9104).
>> - When the FD starts, it tries to connect to the DIR. If the connection succeeds, a presence parameter in the Client resource of the Director daemon changes to "yes". Otherwise the presence parameter keeps its default value "no". For authentication it uses the existing password shared between the File Daemon and the Director. The File Daemon gives its new address to the DIR, so if the client is a laptop, jobs can be run whatever its IP.
>
> The latter can be achieved with bconsole and setip.

Manually. We do it automatically.

>> - When the File Daemon stops, it warns the DIR that it is going away. After this warning, presence_parameter = 0: the DIR knows the client is absent. This feature doesn't work on Windows systems.
>
> That would need to be fixed.

Yes, but if you don't want to use the feature, you are not affected :)
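
To make the presence bookkeeping described above concrete, here is a minimal sketch in Python (not the patch's C code). The class and method names (`PresenceRegistry`, `fd_hello`, `fd_goodbye`) are hypothetical, and a plain shared-secret comparison stands in for Bacula's real authentication exchange.

```python
import hmac
from dataclasses import dataclass


@dataclass
class ClientState:
    """What the DIR remembers about one Client resource (illustrative only)."""
    password: str
    address: str
    present: bool = False  # the "presence parameter": False = "no", True = "yes"


class PresenceRegistry:
    """Hypothetical DIR-side table of clients and their presence state."""

    def __init__(self) -> None:
        self._clients: dict[str, ClientState] = {}

    def add_client(self, name: str, password: str, address: str) -> None:
        self._clients[name] = ClientState(password=password, address=address)

    def fd_hello(self, name: str, password: str, address: str) -> bool:
        """FD announces itself: authenticate, record its address, mark it present."""
        client = self._clients.get(name)
        if client is None:
            return False
        # Assumption: simple constant-time comparison instead of the real handshake.
        if not hmac.compare_digest(client.password.encode(), password.encode()):
            return False
        client.address = address  # a laptop may show up with a new IP
        client.present = True
        return True

    def fd_goodbye(self, name: str) -> None:
        """FD warns the DIR that it is going away."""
        client = self._clients.get(name)
        if client is not None:
            client.present = False

    def is_present(self, name: str) -> bool:
        client = self._clients.get(name)
        return client is not None and client.present

    def address_of(self, name: str) -> str:
        return self._clients[name].address
```

If the FD never sends the goodbye, as reported for Windows, `present` simply stays `True` in this sketch, which mirrors the limitation discussed in the thread.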

>> Perhaps the FD does not terminate on Windows in the same way as it stops on Linux. At least on Windows, Bacula does not go through the function "terminate_filed" in filed.c, so the presence parameter keeps its value of 1. ----> This may be a possible improvement.
>>
>> For the connections at the start of the two daemons, there is a retry_interval of 10 seconds (if a connection fails, retry after 10 seconds) and a max_retry_time of 20 seconds (give up after 20 seconds).
>>
>> Normally, old configurations work fine even though the files are patched. If there are no configuration files when the patch is applied, they are created with the new configuration (Presence parameter, PresenceTimer, WaitTimer, Address of the Director...). Otherwise you must modify the configuration files: if the Presence parameter in the Client resource in bacula-dir.conf and the Address attribute in the Director resource in bacula-fd.conf don't exist, Bacula runs with its classic behaviour.
>
> Hmm... I don't understand if you say that any existing configuration will work as expected or not. Can you clarify this?

Patched Bacula binaries work with existing configurations. The features are not used when the parameters are not set.
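
The startup retry behaviour quoted above (retry_interval of 10 seconds, max_retry_time of 20 seconds) could look roughly like the following Python sketch. The function name `connect_with_retry` and the injectable `sleep` and `clock` arguments are assumptions for illustration. Whether the patch makes a last attempt exactly at the 20-second mark is not stated in the thread, so that detail is a guess.

```python
import time
from typing import Callable


def connect_with_retry(
    try_connect: Callable[[], bool],
    retry_interval: float = 10.0,   # seconds between attempts
    max_retry_time: float = 20.0,   # give up once this much time has passed
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Try to connect until it succeeds or max_retry_time is exhausted."""
    deadline = clock() + max_retry_time
    while True:
        if try_connect():
            return True
        # Stop if the next attempt would fall after the deadline.
        if clock() + retry_interval > deadline:
            return False
        sleep(retry_interval)
```

With the default values and a peer that never answers, this sketch makes attempts at 0, 10 and 20 seconds and then gives up.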

> Also, as the DIR needs to contact all its clients at startup, have you done some performance tests? I suspect that, with several hundred clients, a huge delay might result...

Yes, you're right for a very large number of clients. But this is not how we use Bacula.

>> Here is an example of a new configuration:
>>
>> 1/ In "bacula-dir.conf"
>>
>> Director {
>>   Name = localhost-dir
>>   DIRport = 9101
>>   DIRportFD = 9104
>>   QueryFile = "/usr/bin/query.sql"
>>   WorkingDirectory = "/var/bacula/working"
>>   PidDirectory = "/var/bacula/working"
>>   Maximum Concurrent Jobs = 1
>>   Password = "*******************"
>>   Messages = Daemon
>>   MaxClientsPresence = 20   # How many clients the DIR can listen to simultaneously -----> NEW
>> }
>
> What's the default here, and what happens if there are more clients than allowed by this?

This is a feature for parallelizing calls. If there are more clients than that, they are handled as in a queue.
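
The thread does not say how the queuing is implemented in the patch. As one possible reading of "at most MaxClientsPresence at a time, the rest wait in a queue", here is a small Python sketch using a bounded worker pool. The function name `serve_presence_connections` and the `handle` callback are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def serve_presence_connections(
    clients: Iterable[str],
    handle: Callable[[str], str],
    max_clients_presence: int = 20,
) -> List[str]:
    """Handle client connections with at most max_clients_presence in parallel.

    Clients beyond the limit are not rejected; they wait in the pool's
    internal queue until a worker becomes free.
    """
    with ThreadPoolExecutor(max_workers=max_clients_presence) as pool:
        # map() preserves the input order of the results.
        return list(pool.map(handle, clients))
```

With 200 clients and a limit of 20, this sketch would process all 200, never more than 20 at once.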

>> Client {
>>   Name = localhost-fd
>>   Address = localhost
>>   FDPort = 9102
>>   Catalog = MyCatalog
>>   Password = "****************"
>>   File Retention = 30 days
>>   Job Retention = 6 months
>>   AutoPrune = yes
>>   Presence = yes            # The presence parameter exists -----> NEW
>>   PresenceTimer = 15        # Interval between checks of the client presence -----> NEW
>>   WaitTimer = 60 minutes    # Maximum time to wait for the client -----> NEW
>>   # PresenceTimer and WaitTimer are given in seconds by default. We can use
>>   # minutes, hours, days... as with other time parameters in Bacula.
>> }
>>
>> 2/ In "bacula-fd.conf"
>>
>> Director {
>>   Name = localhost-dir
>>   Address = localhost
>>   DIRport = 9104            # -----> NEW
>>   Password = "*****************"
>> }
>>
>> Here is an explanation of a typical communication between the FD and the DIR:
>>
>> 1/ Starting daemons:
>>
>> 1.1/ DIR starts before FD (most frequent situation)
>>
>>     DIR starts;
>>     DIR tries to connect to FD;
>>     if (FD connected) {
>>         presence_parameter = 1;
>>     }
>>     FD starts;
>>     FD tries to connect to DIR;
>>     if (DIR connected) {
>>         presence_parameter = 1;
>>         FD gives its new address to DIR;
>>     }
>>
>> 1.2/ FD starts before DIR
>>
>>     FD starts;
>>     FD tries to connect to DIR;
>>     if (DIR connected) {
>>         presence_parameter = 1;
>>         FD gives its new address to DIR;
>>     }
>>     DIR starts;
>>     DIR tries to connect to FD;
>>     if (FD connected) {
>>         presence_parameter = 1;
>>     }
>>
>> 2/ Starting job (Backup, Restore):
>>
>>     DIR checks FD presence;
>>     if (FD has no presence_parameter) {        ----> old configuration
>>         run job as with the old configuration;
>>     } else {                                    ----> new configuration
>>         if (FD present) {
>>             run job;
>>         } else {
>>             while (WaitTimer has not expired) {
>>                 check FD connection at every PresenceTimer interval;
>>                 if (FD connected) {
>>                     run job;
>>                     return;
>>                 }
>>             }
>>             mark job as JSAutomaticallyCanceled;
>>         }
>>     }
>>
>> The patch can be downloaded at: http://docs.asperience.fr/bacula-2.4.0_ASP.patch
>>
>> Regards,
>
> Well, I believe I understand what you wanted to solve here, but somehow I don't see why you didn't use the existing functionality.

There is no existing functionality that solves what we implemented.
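
For readers who prefer something executable, the job-start logic from the quoted pseudocode can be sketched in Python as follows. This is an illustration under assumptions, not the patch itself: `start_job`, `JobOutcome` and the `is_present` callback are hypothetical names, and only the value "JSAutomaticallyCanceled" comes from the thread.

```python
import time
from enum import Enum
from typing import Callable


class JobOutcome(Enum):
    RUN = "run"
    AUTOMATICALLY_CANCELED = "JSAutomaticallyCanceled"


def start_job(
    presence_configured: bool,
    is_present: Callable[[], bool],
    wait_timer: float,       # seconds to wait for the client in total
    presence_timer: float,   # seconds between presence checks
    sleep: Callable[[float], None] = time.sleep,
) -> JobOutcome:
    """Decide whether a job runs or is automatically canceled."""
    if not presence_configured:
        # Old configuration: classic behaviour, run without a presence check.
        return JobOutcome.RUN
    waited = 0.0
    while True:
        if is_present():
            return JobOutcome.RUN
        if waited >= wait_timer:
            # The client never showed up within WaitTimer.
            return JobOutcome.AUTOMATICALLY_CANCELED
        step = min(presence_timer, wait_timer - waited)
        sleep(step)
        waited += step
```

With the values from the example configuration (PresenceTimer = 15 seconds, WaitTimer = 60 minutes), this would poll every 15 seconds for up to an hour before giving up.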

> Rerun jobs on failure, limit the time to start a job, dynamic DNS updates for mobile clients, and the setip command of bconsole should be sufficient to achieve what you need.

Not at all.

> Can you clarify why those are *not* sufficient for you?

This was already explained in August (http://www.mail-archive.com/[email protected]/msg02833.html).

> Cheers,
>
> Arno

-- 
Jean-Sébastien Hederer
Manager, ASPerience
http://www.asperience.fr
_______________________________________________ Bacula-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/bacula-devel
