hi Rainer,
so to tell the true tale, isn't the story...

- You got customers on 5.5 using session replication
- Your customers want to move to Tomcat 6
- You're not confident about the maturity of Tomcat 6's clustering codebase, mainly because you haven't used it, even though it was originally developed in 2006. I would argue that this doubt is mainly a lack of both usage and understanding
- So, to mitigate this, you'd like to use the ASF as the delivery vehicle for your custom code base to allow your customers to switch to Tomcat 6

I'm ok with you doing this for those folks in sandbox. I would probably recommend that we not make this an official Tomcat release, as I believe our small group would do much better focusing its effort on the current implementation. Putting it into a release means we would spend resources on the bugs that arise from the port itself.

I do have some comments inline too


Rainer Jung wrote:
Hi Filip,

Filip Hanik - Dev Lists schrieb:
Rainer Jung wrote:
<snip>

But I do not only have these very abstract concerns. There is room for improvement and I'll happily help as I did in 2004 for the TC 5.0/5.5 cluster. Examples for improvements:

- monitoring (the old MBeans are gone and there's no good alternative)
ClusterJMXHelper.java in tomcat/trunk, feel free to add more
- Java 5 dispatcher: it doesn't use a queue at the moment,
  instead it uses a thread pool.
huh? Yes, the thread pool uses a queue to queue its jobs. set threads=1, and you have a queued async sender, just like 5.5
- we need to find out if threads are too expensive for caching messages
    in case replication gets slow or stuck
not sure what this means
- We can easily run into ordering problems if we use a separate thread
    for each message. I know there's an ordering interceptor, but
    there's a big difference between you can optionally use it or you
    have to use it to make replication correct.
use the dispatcher with a single thread and it automatically becomes ordered, just like 5.5. TC session replication has never really relied on ordering; even in 5.5 you can get unordered messages, since they are received using a thread pool. An ordering interceptor is really the only way you could guarantee it.

- documentation: the huge flexibility of HA/Tribes calls for better
  documentation
agreed
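
To illustrate the single-thread point made in the replies above: a minimal sketch, using only java.util.concurrent, of a thread pool with one worker and an unbounded LinkedBlockingQueue. This is not Tribes code; the class and method names are hypothetical. With threads=1 the pool's queue hands jobs to the single worker in FIFO order, so messages are processed in submission order on the sending side.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the Tomcat/Tribes code base.
public class SingleThreadDispatchSketch {

    /**
     * Submits messages 0..count-1 to a fixed-size pool and returns the
     * order in which the worker thread(s) processed them.
     */
    public static List<Integer> dispatch(int threads, int count) {
        List<Integer> processed = Collections.synchronizedList(new ArrayList<Integer>());
        // Fixed pool backed by a FIFO queue: jobs wait in the queue
        // whenever all worker threads are busy.
        ExecutorService pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < count; i++) {
            final int msg = i;
            pool.execute(() -> processed.add(msg));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("dispatch did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }

    public static void main(String[] args) {
        // One worker: processing order equals submission order.
        System.out.println(dispatch(1, 10));
    }
}
```

With more than one thread the same code gives no ordering guarantee, which is the case an ordering interceptor is meant to cover. The sketch says nothing about ordering on the receiving side.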

sounds a little bit like there is a disconnect in the knowledge of what 5.5 provides vs 6.0

That's very likely. But as I wrote, my main concern is of a much more general nature and contained in what you abbreviated with "<snip>".

Although I think that a detailed technical discussion is not the right way to determine the usefulness of OACC, some comments:

- monitoring: your reference to trunk strengthens my argument about maturity of code. Taking trunk code instead of TC 5.5 code is the maximum opposite approach.

- Java 5 dispatcher: I mostly agree. I got lost in the code. The code I thought was responsible was transport/PooledSender.java which uses a fixed pool of threads without queueing. I overlooked somehow the Executor with queue in the Java 5 dispatcher. Nevertheless there's still some discrepancy, because we added some aspects to the queue in 5.5 which are gone now:

  - lock fairness biased to the remover in order to reduce the
    likelihood of lock starvation
the LinkedBlockingQueue implements a two-lock algorithm to avoid lock contention around simultaneous puts and takes.

  - taking over the whole queue by the remover instead of
    removing item by item (again less lock contention paired with
    less context switching)
yes, that's a neat idea. The question is how it compares against the two-lock algorithm: since TC 5.5 uses a single lock, my guess is that there is a larger risk of contention there than with a two-lock algorithm.
  - limited size: Favor prevention of OutOfMemoryError over replication
    correctness in case we run into replication communication problems.
    Priority is always on the primary function, i.e. a working webapp,
    clustering is always a secondary function which should be as
    transparent as possible during normal operations
this is also implemented, and very accurate. default queue size is set to 64MB, and you can control what behavior to use when you reach that limit.
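
The queue aspects discussed above (bulk take-over by the remover, and a size limit that favors avoiding OutOfMemoryError over replication completeness) can be sketched with a plain java.util.concurrent.LinkedBlockingQueue. This is a rough illustration under assumptions, not the 5.5 or Tribes implementation: the class name is hypothetical, the limit is counted in messages rather than bytes, and dropping on overflow is just one possible policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not taken from the Tomcat/Tribes code base.
public class BoundedQueueSketch {
    private final LinkedBlockingQueue<String> queue;
    private final AtomicInteger dropped = new AtomicInteger();

    public BoundedQueueSketch(int capacity) {
        // Assumption: capacity is a message count, not a byte size.
        this.queue = new LinkedBlockingQueue<String>(capacity);
    }

    /**
     * Non-blocking enqueue. When the queue is full the message is dropped
     * and counted, so the caller (the webapp request thread) never blocks
     * and memory use stays bounded.
     */
    public boolean enqueue(String msg) {
        boolean accepted = queue.offer(msg);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** Remover side: takes everything currently queued in one call. */
    public List<String> drainAll() {
        List<String> batch = new ArrayList<String>();
        queue.drainTo(batch);
        return batch;
    }

    public int dropped() {
        return dropped.get();
    }

    public static void main(String[] args) {
        BoundedQueueSketch q = new BoundedQueueSketch(2);
        q.enqueue("a");
        q.enqueue("b");
        q.enqueue("c"); // queue is full, so this one is dropped
        System.out.println(q.drainAll() + " dropped=" + q.dropped());
    }
}
```

Here drainTo plays the role of "taking over the whole queue", while offer on a bounded queue gives the limited-size behavior. Whether this beats a single-lock queue with a fairness bias is exactly the open question in the thread, and the sketch does not answer it.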



We could go into detail here, but I would prefer to do this in a separate discussion thread. I don't think that those examples are big problems, and maybe they are not problems at all.
I do agree that JMX support is a major uha, but the other points seem to be based on not understanding or misunderstanding the current implementations, and I'd like to address those for you

I want to stress once again my own experience, that a huge code base implementing a complex apparatus needs time to mature. Thus I think it's fair to not simply lock up happy TC 5.5 cluster users inside 5.5 and offer OACC as an intermediate step on the way to migrate to HA/Tribes.
knock yourself out :)

Filip
