Jesse Norell writes:

Hello,
---- Original Message ----
From: Aaron Stone <[email protected]>
To: [email protected]
Subject: Re: [Dbmail-dev] RE: unique_id discussion/problem
Sent: Thu, 12 Jun 2003 07:21:42 -0700 (PDT)
In fact, I would highly recommend that a database is used. I envision a
table that has a row for each server in the cluster and a "uuid prefix" or
something to the like. Synchronization information might also be stored in
this table, such as the IP address of each server as it links up with a
row in the database and the timestamp of when it last attached.
Naturally this table will have to be replicated, and so it should not have
an auto_increment column, but something else more unique. Hostnames or IP
addresses are an obvious answer, if not a good one ;-)

  I don't think ip addresses would be unique enough - some cluster
implementations have multiple machines with the same address (eg. via
load-balancing hardware switches).  Nor hostname (eg. we have multiple
machines for mail.kci.net - while they do have unique hostnames also,
there's no reason they would necessarily have to).  The mac addr
seems like the best almost-always-unique identifier that's readily
available cross-platform.
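
As a rough illustration of the MAC address idea, here is a minimal Python sketch. The helper name `node_prefix` is hypothetical and not part of dbmail. It relies on the standard library's `uuid.getnode()`, which returns the hardware address as a 48-bit integer, and which falls back to a random 48-bit number when no address can be found, so the result is only "almost always" tied to the hardware.

```python
import uuid


def node_prefix() -> str:
    """Return this machine's node id as 12 hex digits (hypothetical helper).

    uuid.getnode() yields the MAC address as a 48-bit integer; if no
    hardware address can be determined it returns a random 48-bit value.
    """
    node = uuid.getnode()
    return f"{node:012x}"


if __name__ == "__main__":
    print(node_prefix())
```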

Why the MAC address? Entropy is constantly growing; there is nothing more unique than a generated id, and there are tons of generators out there to do so. However, you can always use specific sequences which are in a way predictable, where you know exactly what you're going to get. Synchronizing is a different matter, which can be done on the fly: let's say machine N joins cluster A, where the machine identifies itself and waits for a specific id from which it derives the sequence factor. With these words I assume that you're aware of things like negotiation algorithms and so on.

What I personally use (besides postfix and dbmail itself, with pgsql): the beginning was easy. Let pgsql handle the sequence generation, where _ANY_ one who has access to the database can alter the sequence generation on the fly. In other words, the app is able to re-assign a new sequence to the different servers by simply altering the sequence factor, which is contained in the sequence table itself. Being aware of the algorithm used for the generation, it can simply calculate what the factor for the next server would be, and this can be totally automated and in a way unique.

In my eyes, unique email addresses and login ids are a different case, where things get more complicated. Since in my approach I'd prefer to escape the collisions produced by some half-finished async replication processes, I'd search for a more complex and sophisticated solution there. However, the above also solves the problem if any of those machines has to work on its own due to link failure or whatever: we won't get any bloody collisions, since each machine is already using a unique factor for its generation. The IDs themselves don't matter; the factor is the one that should be unique, and it would be a huge advantage if it's predictable by any of the servers.
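
One way to read the "sequence factor" described above is as a per-server offset combined with a shared step, so that each server draws ids from its own interleaved stream. The Python sketch below is only that reading, not the actual pgsql setup; the names `id_stream`, `owner_of` and `max_servers` are assumptions made for illustration.

```python
from itertools import count, islice


def id_stream(cluster_id: int, max_servers: int):
    """Yield cluster_id, cluster_id + max_servers, cluster_id + 2*max_servers, ...

    Assumption: every server knows max_servers (the shared step) and holds a
    distinct cluster_id (its factor) in the range [0, max_servers).
    """
    if not 0 <= cluster_id < max_servers:
        raise ValueError("cluster_id must be in [0, max_servers)")
    return count(cluster_id, max_servers)


def owner_of(message_id: int, max_servers: int) -> int:
    """Any server can compute which factor produced a given id."""
    return message_id % max_servers


if __name__ == "__main__":
    # Two servers generating ids independently, e.g. during a link failure.
    print(list(islice(id_stream(0, 4), 3)))
    print(list(islice(id_stream(1, 4), 3)))
```

Because the streams never overlap, a server that is cut off from the cluster can keep generating ids without colliding with the others, and the factor of a newly joined server can be computed as the next unused offset.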

Guys, if I'm being annoying or I'm not writing on the right topic, pretty please let me know; I don't want to be boring and stuff. But again, if you have any arguments against this solution, spread them across the list before jumping into something like UUIDs. Not that I have something against them, just that email is so atomic that it shouldn't need such a complicated solution.
For clustering, here's how my stuff works:
Postfix + PgSQL Patch
DBmail + some dumb connection checks. + PgSQL
PgSQL itself is used with PgReplicator. I use pgsql sequences to generate the ids (which was the first approach with dbmail). I chose PgSQL because sequences are highly granular and easy to control. Each machine in the cluster has its own ClusterID; also, with a kind of HeartBeat monitor, it's aware of how many servers there are and what their IDs are. It basically reads this from a conf file, where the primary source for those settings is a table inside PgSQL; the file is just a redundant option in case PgSQL on this machine fails to respond. Basically both the database and the files are updated at the same time.
I use them in the following order:
mx1: RR(A records) dmx1 and dmx2
mx2: RR(A records) dmx2 and dmx3
Also, when I install a dbmail system on a new cluster server, it generates the PgSQL schema on the fly, being aware of the cluster ID that was negotiated using the HB monitor. A huge role is also played by PgReplicator, which gives me the ability _NOT_ to replicate sequences and other tables like the postfix aliases (which are in fact totally useless, but somehow I have to tell postfix to shut up with the annoying msg). In this case the setup is not a free mailserver but one for dedicated corporate use, so I don't have users coming around and registering. So far I haven't seen many problems, not to say any. One thing is for sure: after the crash I had with MySQL, I'm staying away from it; as my CTO says, MySqueel :)

I'm not sure what the "non-volatile" storage is needed for in your
proposal beyond what I see as a unique prefix for each dbmail in the
cluster as it writes to a replicated database server...

  Saved state info is basically for rollbacks in time (eg. machine
reboots) and to make multiple uuids generated w/in the same clock
tick be unique (because they're based largely upon time).
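
A minimal Python sketch of what such saved state could look like, assuming a time-based id made of (tick, clock sequence, counter, node). The names `SavedState` and `next_id` are hypothetical, and the state is kept in memory here; in a real deployment it would have to be written to non-volatile storage so it survives a reboot.

```python
from dataclasses import dataclass


@dataclass
class SavedState:
    last_tick: int = -1   # last clock tick an id was issued for
    clock_seq: int = 0    # bumped whenever the clock is seen to go backwards
    counter: int = 0      # ids already issued within last_tick


def next_id(state: SavedState, now_tick: int, node: str):
    """Return a unique (tick, clock_seq, counter, node) tuple and update state."""
    if now_tick < state.last_tick:
        # Clock rolled back (e.g. after a reboot): switch to a new clock sequence.
        state.clock_seq += 1
        state.counter = 0
    elif now_tick == state.last_tick:
        # Same clock tick: the counter separates the ids.
        state.counter += 1
    else:
        state.counter = 0
    state.last_tick = now_tick
    return (now_tick, state.clock_seq, state.counter, node)


if __name__ == "__main__":
    state = SavedState()
    for tick in (100, 100, 101, 50):
        print(next_id(state, tick, "node-a"))
```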

Here you mean fail-over support, which is supposed to be handled by the replication process, or in dbmail itself for maximum portability? Or am I assuming wrong?
cheers,
-lou
