[ 
https://issues.apache.org/jira/browse/GEODE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715664#comment-16715664
 ] 

John Blum commented on GEODE-6032:
----------------------------------

[~agingade] - In terms of the items you listed for support, I agree.

Yes the system needs to support:

1) Apply and send entire data.
2) Apply and send delta.
3) Nothing to apply, send no-data.

I changed the wording/semantics of the 3rd point and reordered the items in 
terms of logical occurrence.

Using the (HTTP) Session example, first a user logs in to the Webapp and 
creates a (new) Session.  Again, by definition, the *new* Session object is 
dirty, and any "dirty" object has a delta (in this case, from nothing to 
something). Therefore, the first time, the client must send the "entire object".

Also, from our discussion, I pointed out that, even if the client tries to send 
a "delta" for a new object the first time, Geode does the *right" thing and the 
server responds to the client that it has no corresponding object for the delta 
just sent by the client.  In turn, the client immediately turns around and 
sends the entire object by calling {{DataSerializer.toData(..)}} or 
{{DataSerializable.toData(..)}}, which ever is implemented.  This occurs all 
with in the first {{Region.put(key, value)}} call when {{Delta.hasDelta()}} 
returned *true*.

After the servers know about the (Session) object, the client *only* ever needs 
to send the delta, until the (Session) object is explicitly removed, expires or 
is invalidated, in which case, we can rinse and repeat.  Try and send delta, 
otherwise send entire object which the client/server interaction already 
handles.

If there is "no change", as specified by {{Delta.hasDelta()}}, which again my 
application domain object must implement to handle deltas in my own types, then 
Geode need do nothing.  There is no possible inconsistency, no cache callbacks 
that need to be fired, not clients that need to be notified, nothing.  My 
application object just informed Geode there were no changes.

Sending the entire object is (potentially) resource intensive (wasteful) and 
introduces possible "_lost updates_".

This is not my opinion (or perspective), this is fact.



> Entire object is serialized again (and again) when Delta.hasDelta() returns 
> false and client is using PROXY Region
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-6032
>                 URL: https://issues.apache.org/jira/browse/GEODE-6032
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>    Affects Versions: 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.2.1, 1.4.0, 1.5.0, 1.6.0, 
> 1.7.0
>            Reporter: John Blum
>            Assignee: Anilkumar Gingade
>            Priority: Major
>              Labels: SmallFeature, pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently, if a cache client application is...
> 1) configured with a client {{PROXY}} Region, and...
> 2) the application is also using {{DataSerialization}} to de/serialize 
> objects to/from the servers, and...
> 3) the application domain objects implement the {{org.apache.geode.Delta}} 
> interface then Apache Geode will incorrectly send the entire object (again) 
> when {{Delta.hasDelta}} returns *false*.
> It is understandable that the application domain object needs to be 
> serialized in its entirety the first time the object is sent to the server(s) 
> (or if the object is later, subsequently removed or has expired/been evicted, 
> and then needs to be re-added for whatever reason).
> However, once the server(s) know about the object, then only ever a "delta" 
> should be sent, and only when {{Delta.hasDelta()}} returns *true*.  
> Otherwise, if {{Delta.hasDelta()}} returns *false*, then most certainly, the 
> entire object should not be sent again, otherwise it is possible for the 
> application to enter a "_race condition_" scenario where the object gets 
> "overwritten", and as a result, the application can lose data (aka "_lost 
> updates_").
> If users were to change their client Region data management policy from 
> {{PROXY}} to {{CACHING_PROXY}} then this works as expected.  Apache Geode 
> will only send updates for an object it already knows about if there is 
> actually a "delta", otherwise Geode does nothing (that is, does not send the 
> entire object or any delta to the server(s) since there is technically 
> nothing to send).
> Obviously, in the {{CACHING_PROXY}} case, there is "local" state to compare 
> against, and therefore, Geode knows about the object already, in that it 
> "exists".  It can therefore assess the object to determine if it is the 
> same/unchanged, and not do anything in the case where {{Delta.hasDelta()}} 
> returns *false*, thus the "application" informing Geode there is nothing to 
> send.
> Clearly, in the {{PROXY}} case, there is no "local" state, and therefore, 
> Geode does not know whether the object (already) exists on the servers or 
> not.  So, if {{Delta.hasDelta()}} returns *false*, it is unsure whether the 
> objects exists or not and so decides just to send the entire object again, a 
> "_premature optimization_" to be sure, which now has sacrificed 
> "_correctness_", and has amplified the possible "_race conditions_" on the 
> application side.
> However, this is no different than if {{Delta.hasDelta()}} returns *true* and 
> the object is *not yet* known by the servers.  When the client sends just the 
> delta in this case, the server will send back to the client, "I don't know 
> anything about this object for which the delta needs to be applied", and 
> therefore, the client must turn around and send it the entire object anyway.
> So, in the {{PROXY}} case, it would be better if the client made a 
> determination about whether the object truly exists on the server side or 
> not, first, before arbitrarily and falsely assuming the entire object should 
> be sent again if the {{Delta.hasDelta()}} returns *false*.  The client simply 
> does not know and should "verify" before sending the object.
> Obviously, this affect will performance, but is a small price to pay (and the 
> "correct" thing to do) compared with "_lost updates_" and amplifying "_race 
> conditions_" on the client-side.
> There is also a situation where {{CACHING_PROXY}} client Regions can even 
> *fail*, and that is when {{copy-on-read}} is set to *true*.  Hopefully, this 
> is obvious why.
> To make matters worse, even the 
> [_Javadoc_|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html#hasDelta--]
>  explains and implies that only "_pending changes_" are written if they 
> exist...
> > "Returns true if this object has pending changes it can write out."
> Of course, this doc is less than clear and very ambiguous about what exactly 
> happens when ((hasDelta()) return *false*, which is "pertinent" information.  
> But, to be sure, it is certainly not consistent in behavior when different 
> data management policies are in effect, and most definitely not correct!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to