Re: Back-porting BZ57338 SSO + Cluster improvements

Mark Thomas Wed, 31 Dec 2014 09:55:13 -0800

On 24/12/2014 08:18, Keiichi Fujino wrote:
> 2014-12-19 21:40 GMT+09:00 Mark Thomas <ma...@apache.org>:
> 
>> On 17/12/2014 16:57, Mark Thomas wrote:
>>> On 17/12/2014 09:58, Keiichi Fujino wrote:
>>
>>>> [A] I think that the process of updating SSO entry needs to be notify
>>>> cluster.
>>>
>>> I'll take a look at this. I didn't see any issues in my simple testing
>>> but I may well have missed this scenario.
>>
>> My testing did indeed miss this. I have partially fixed this with one
>> more commit to come.
>>
>>>> [B] The mapSendOptions and terminateOnStartFailure attributes do not
>> need
>>>> to be configurable?
>>>
>>> Fair point. I'll add that.
>>
>> Fixed.
>>
>>>> [C] Is deregister(String ssoId, Session session) method required?
>>>> I do not know where it will be called.
>>>
>>> I'll look into it.
>>
>> Removed.
>>
>>>> [D] If my understanding is correct,
>>>> Session objects are associated with SingleSignOnEntry of primary node
>>>> and SingleSignOn is registered as a SessionListener.
>>>> I think this is no problem.
>>>> However, when the primary node goes down, because SingleSignOnEntry of
>>>> non-primary node is
>>>> not associated with the session object, SingleSignOn can not receive a
>>>> SessionEvent.
>>>> As a result, SingleSignOnEntry does not delete because the deregister
>>>> method is no longer called.
>>>> This might be a potential memory leak.
>>>
>>> My testing definitely didn't cover this. I'll add some logging and
>>> extend my tests.
>>
>> Yes this was an issue. I have a fix for this on the way. See below.
>>
>>>> At least, the new node that was promoted to the primary node needｓ to
>>>> associate the session object again.
>>>> For example, the association of session might be able to in
>>>> objectMadePrimary method.
>>>> However, because there is a concern of [E], is not called
>> objectMadePrimary
>>>> method.
>>>>
>>>> [E] Although I think this is another potential problem,
>>>> ReplicatedMap sends a COPY message to non-primary node.
>>>> Node that received the COPY message does not save the entry as a
>>>> backup(MapEntry.backup=false).
>>>> As a result, MapEntry.isPrimary() always returns true.
>>>> I do not know yet whether this is a problem,
>>>> but, at least,  processing of entry  relocate in memberDisappeared seem
>> to
>>>> not work properly.
>>>
>>> Since every MapEntry should be replicated in full to every node (I went
>>> with ReplicatedMap rather than LazyReplicatedMap) I don't think this is
>>> an issue but I'll look in to this as well.
>>
>> It was an issue. It gets more complicated when you consider that this
>> needs to work with both the Delta and Backup Managers.
>>
>> I looked into making stickiness based on the SSO cookie but that had a
>> number of issues:
>> - The SSO cookie doesn't exist until there is an authentication
>> - The SSO cookie is only a cookie - there is no support for URL encoding
>> - Current load-balancer configs would need to change to track the SSO
>>   cookie
>> All of these are solvable but the solution was starting to look messy so
>> I looked elsewhere.
>>
>> I next considered switching to LazyReplicatedMap. This solved that lack
>> of objectMadePrimary() calls but created a whole other set of problems
>> mostly because different contexts in the SSO would be stuck to different
>> nodes so there was no node that made sense as the SSO primary.
>> Stickiness based on the SSO cookie would solve this but as described
>> above, that gets messy.
>>
>> The solution I opted for was to make the SSO SessionListener
>> Serializable and use that to perform the necessary deregistration. It
>> effectively pushes the elements of the reverse Map into the session. For
>> the clustered case, I added support for replicating listeners (if marked
>> with the right interface) as part of session replication.
>>
>> The end result is something that isn't too complicated and works with
>> any of the current Cluster Managers and should work with custom ones as
>> well. The only downside is that there are nearly always more attempts to
>> deregister sessions than are required as the listener will fire for each
>> node in the cluster. The up side of this is that it is pretty much
>> impossible for an edge case to slip through where the session isn't
>> removed from SSO.
>>
>>
> ClusterSingleSignOn#associate and ClusterSingleSignOn#update invokes the
> AbstractReplicatedMap#replicate(ssoId, true).
> In fact, AbstractReplicatedMap#replicate(ssoId, true) sends not a MSG_COPY
> message but a MSG_BACKUP message.
> Is this a problem?


I don't believe so. I did various tests and the only issues were the
multiple attempts to de-register sessions already mentioned.

> The current implementation of AbstractReplicatedMap seems to not handle
> COPY nodes explicitly.
> In some methods (get, replicate, messageReceived, memberDisappeared, etc.),
> it may be better that a COPY node is handled explicitly.
> Or override these methods (some or all) in LazyReplicatedMap and
> ReplicatedMap.

I agree with you that some further work is required to make these Maps
behave as expected in the general case. However, I am content that what
we have at the moment is 'good enough' for ClusterSSO so I am not
planning any further improvements in this area.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Re: Back-porting BZ57338 SSO + Cluster improvements

Reply via email to