[Dbmail-dev] Re: unique_id discussion/problem

Lou Kamenov Fri, 13 Jun 2003 00:49:14 +0200 (CEST)

Jesse Norell writes:

Hello,

---- Original Message ----
From: Aaron Stone <[email protected]>
To: [email protected]
Subject: Re: [Dbmail-dev] RE: unique_id discussion/problem

Sent: Thu, 12 Jun 2003 07:21:42 -0700 (PDT)

In fact, I would highly recommend that a database is used. I envision a
table that has a row for each server in the cluster and a "uuid prefix" or
something to the like. Synchronization information might also be stored in
this table, such as the IP address of each server as it links up with a
row in the database and the timestamp of when it last attached.

Naturally this table will have to be replicated, and so it should not have
an auto_increment column, but something else more unique. Hostnames or IP
addresses are an obvious answer, if not a good one ;-)
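
A minimal sketch of the kind of table described above, using Python's
built-in sqlite3 module only so the example is self-contained (the thread
itself is about MySQL and PgSQL). The table name cluster_nodes, the column
names, and the attach() helper are assumptions made for illustration; none
of them come from dbmail.

```python
import sqlite3

# Hypothetical schema: one row per server, keyed by something stable
# (hostname, MAC address, ...) rather than an auto_increment column.
SCHEMA = """
CREATE TABLE cluster_nodes (
    node_key      TEXT PRIMARY KEY,
    uuid_prefix   TEXT NOT NULL UNIQUE,
    ip_address    TEXT,
    last_attached TEXT
)
"""

def open_registry():
    """Create an in-memory registry; a real cluster would use its shared database."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db

def attach(db, node_key, uuid_prefix, ip_address, now):
    """Record that a node linked up; a known node keeps its original prefix."""
    cur = db.execute(
        "UPDATE cluster_nodes SET ip_address = ?, last_attached = ? "
        "WHERE node_key = ?",
        (ip_address, now, node_key),
    )
    if cur.rowcount == 0:
        # First attach: the UNIQUE constraint rejects a prefix already in use.
        db.execute(
            "INSERT INTO cluster_nodes "
            "(node_key, uuid_prefix, ip_address, last_attached) "
            "VALUES (?, ?, ?, ?)",
            (node_key, uuid_prefix, ip_address, now),
        )
    db.commit()
```

A second node that tries to register an already-used prefix gets an
sqlite3.IntegrityError instead of silently sharing it.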


  I don't think IP addresses would be unique enough - some cluster
implementations have multiple machines with the same address (e.g. via
load-balancing hardware switches).  Nor would hostnames (e.g. we have
multiple machines for mail.kci.net - while they do have unique hostnames
also, there's no reason they would necessarily have to).  The MAC address
seems like the best almost-always-unique identifier that's readily
available cross-platform.

Why the MAC address? Entropy is constantly growing; nothing is more unique
than a generated id, and there are tons of generators out there for that.
However, you can always use specific sequences that are predictable, so
you know exactly what you are going to get. Synchronizing is a different
matter and can be done on the fly: say a machine N joins a cluster A,
identifies itself, and waits for a specific id from which it derives its
sequence factor. I assume you're familiar with negotiation algorithms and
the like.

What I personally use (besides Postfix and dbmail itself, with PgSQL): the
beginning was easy. Let PgSQL handle the sequence generation, where _ANY_
one who has access to the database can alter the sequence generation on
the fly. In other words, the app can re-assign new sequences to the
different servers simply by altering the sequence factor, which is stored
in the sequence table itself. Knowing the algorithm used for the
generation, it can simply calculate the factor for the next server, and
this can be fully automated and, in a way, unique.

In my eyes unique email addresses and login ids are a different case,
where things get more complicated. Since I'd prefer to avoid the
collisions produced by half-finished async replication processes, I'd look
for a more complex and sophisticated solution there.

However, the above also covers the case where any of those machines has to
work on its own due to a link failure or whatever: we won't get any
collisions, since each machine is already using a unique factor for its
generation. The ids themselves don't matter; the factor is what has to be
unique, and it is a huge advantage if any of the servers can predict it.
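
One possible reading of the "sequence factor" idea, which the post does
not spell out, is an interleaved sequence: every server shares the same
step and gets its own starting offset (the factor), so the id streams can
never overlap, even when a server is cut off from the rest. PostgreSQL
sequences can be set up this way with CREATE SEQUENCE ... INCREMENT BY ...
START WITH .... The sketch below shows only the arithmetic; MAX_SERVERS,
id_stream() and owner_of() are hypothetical names.

```python
from itertools import count, islice

MAX_SERVERS = 16  # assumed upper bound on cluster size; used as the shared step

def id_stream(factor, step=MAX_SERVERS):
    """Yield factor, factor + step, factor + 2*step, ... for one server."""
    if not 1 <= factor <= step:
        raise ValueError("factor must be in 1..step")
    return count(start=factor, step=step)

def owner_of(message_id, step=MAX_SERVERS):
    """Recover the factor of the server that generated a given id."""
    return (message_id - 1) % step + 1

# Two servers generating ids independently, with no coordination at all.
server_a = list(islice(id_stream(1), 3))   # [1, 17, 33]
server_b = list(islice(id_stream(2), 3))   # [2, 18, 34]
```

Because the factor can be computed back from any id, every server can
tell which peer produced a row, which is the "predictable" property the
post asks for. The trade-off of this particular scheme is that the step
caps the cluster size.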

Guys, if I'm being annoying or writing off topic, please let me know; I
don't want to be a bore. But again, if you have any arguments against this
solution, post them to the list before jumping into something like UUIDs.
Not that I have anything against them; it's just that email is so atomic
that it shouldn't need such a complicated solution.

For clustering, here is how my setup works:
Postfix + PgSQL patch
DBmail + some dumb connection checks + PgSQL

PgSQL itself is used with PgReplicator. I use PgSQL sequences to generate
the ids (which was the first approach with dbmail). I chose PgSQL because
sequences are highly granular and easy to control. Each machine in the
cluster has its own ClusterID; also, with a kind of heartbeat monitor, it
knows how many servers there are and what their ids are. It basically
reads this from a conf file, where the primary source for those settings
is a table inside PgSQL; the file is just a redundant option in case PgSQL
on this machine fails to respond. Basically both the database and the
files are updated at the same time.
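
A rough sketch of that "database first, conf file as fallback" lookup.
Everything here is hypothetical: load_cluster_config(), the JSON file
format, and the fetch_from_db callable standing in for the real PgSQL
query. It also refreshes the file whenever the database read succeeds,
which is one simple way to keep the two copies in step and not
necessarily how the setup described above does it.

```python
import json
import os
import tempfile

def load_cluster_config(fetch_from_db, cache_path):
    """Return the cluster settings, preferring the database over the file."""
    try:
        config = fetch_from_db()
    except Exception:
        # Database unreachable: fall back to the last copy written to disk.
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)

    # Database answered: rewrite the fallback file atomically so a crash
    # in the middle of the write cannot leave a truncated conf file.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh)
    os.replace(tmp_path, cache_path)
    return config
```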

I use them in the following order:
mx1: RR(A records) dmx1 and dmx2
mx2: RR(A records) dmx2 and dmx3

and so on. Also, when I install a dbmail system on a new cluster server,
it generates the PgSQL schema on the fly, knowing the cluster ID that was
negotiated using the HB monitor. PgReplicator also plays a huge role, as
it gives me the ability _NOT_ to replicate sequences and other tables such
as the postfix aliases (which are in fact totally useless, but somehow I
have to tell postfix to shut up with the annoying message). In this case
the setup is not a free mail server but a dedicated corporate one, so I
don't have users coming around and registering. So far I haven't seen many
problems, not to say any. One thing is for sure: after the crash I had
with MySQL I'm staying away from it. As my CTO says, MySqueel :)

I'm not sure what the "non-volatile" storage is needed for in your
proposal beyond what I see as a unique prefix for each dbmail in the
cluster as it writes to a replicated database server...


  Saved state info is basically for rollbacks in time (eg. machine
reboots) and to make multiple uuids generated w/in the same clock
tick be unique (because they're based largely upon time).
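
A small sketch of why that saved state matters, loosely in the spirit of
the clock sequence used by time-based UUIDs rather than an implementation
of any UUID standard. The class name, the tuple-shaped id, and the
injectable clock are assumptions for illustration; writing snapshot() to
non-volatile storage is left out.

```python
import time

class TimeIdState:
    """State that would be saved to non-volatile storage between runs."""

    def __init__(self, last_tick=0, clock_seq=0, counter=0, clock=time.time_ns):
        self.last_tick = last_tick
        self.clock_seq = clock_seq
        self.counter = counter
        self._clock = clock

    def next_id(self, prefix):
        tick = self._clock()
        if tick < self.last_tick:
            # Clock went backwards (e.g. after a reboot): start a new
            # clock sequence so old and new ids cannot collide.
            self.clock_seq += 1
            self.counter = 0
        elif tick == self.last_tick:
            # Several ids within the same clock tick: tell them apart.
            self.counter += 1
        else:
            self.counter = 0
        self.last_tick = tick
        return (prefix, self.clock_seq, tick, self.counter)

    def snapshot(self):
        """Values to persist so a restarted process can resume safely."""
        return {
            "last_tick": self.last_tick,
            "clock_seq": self.clock_seq,
            "counter": self.counter,
        }
```

Without the persisted last_tick and clock_seq, a process restarted after
the clock was set back would have no way of knowing it might reissue ids
it had already handed out.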

Here you mean fail-over support, which is supposed to be handled by the
replication process, or in dbmail itself for maximum portability? Or am I
assuming wrong?

cheers,

-lou
