Hi George, this is a huge topic, but I'll try to answer some of your
questions inline:
On 03/16/2014 04:19 PM, George Toma wrote:
Hi all,
I'm trying to achieve the following functionality and I would like an
opinion, advice from more experienced people:
- the goal is to provide a replication system which will publish
messages to a JMS queue ( OpenMQ) based on the changes that occur in the
database tables.
* using a JMS queue will allow me loosely coupled system design (
I'm opened for suggestion here too )
E.g.
* tables - tBook ( the table which is representing a book
entity), tBookAuthor
* tables audit - tBook_aud, tBookAuthor_aud
* replication table : tReplicationMessage - this table holds the
operation type ( insert/update/delete ), the changed columns, the
table name ( e.g. tBook), and the primary key.
* My current intention is to provide a stored procedure which will
publish the messages on the JMS queue ( triggered by the
tReplicationMessage table changes). The values that change will be
taken from the audit tables.
* Google protocol buffers will be used as messages format (
faster, optimal solution)
What I don't like at this approach is the following case :
* 20 tables ( tBookFormat, tBookCategory etc), and each table with
more than 1000 changes per second will trigger too many Java
processes ( my stored procedure from tReplicationMessage table) to
be executed for each change.
Not really sure what you mean by "Java process" here. Stored procedures
execute within the same jvm and the same thread as they are called from.
That being said, 20000 tps is a high number, so you should probably do
some benchmarking before committing to a particular design.
What is the right way of implementing a
batch system in order to be feasible( batch the messages, e.g. 5
messages, and fire a single Java process to send data to the JMS
queue) ?
Ok, I think I see what you're getting at - your JMS interface is to
spawn a separate process to inject each message?
What is wrong here and what should be the best approach with Derby DB ?
I have never tried this, but my gut feeling is that it is a non-starter,
even for much lighter loads than you have in mind. Java never really
followed the UNIX philosophy of combining lots of small simple processes
in a pipeline to do the work. Frankly, in this context, the jvm is a
behemoth and trying to spawn many is rarely a good idea.
Are you sure the JMS interface does not have a java api, e.g. for use in
an appserver? Seems to me that you would want to access this api from
wihin your Derby stored procedure.
In terms of batching, you could have your trigger just insert records
into a queue-table, and then have another thread batch records from this
queue for sending. That obviously becomes a classic consumer/producer
problem with all that entails (load control, etc.).
--
Regards,
Dyre