[ 
https://issues.apache.org/jira/browse/GEODE-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489931#comment-16489931
 ] 

Bruce Schuchardt commented on GEODE-5253:
-----------------------------------------

My assessment was off on this ticket.  The shared-buffer problem I 
described was caused by some refactoring I had done.  The original issue, 
however, still exists: it occurred in a CI regression run.

> PDX Object corrupted in remove(K,V) or put(K,V,V) operations
> ------------------------------------------------------------
>
>                 Key: GEODE-5253
>                 URL: https://issues.apache.org/jira/browse/GEODE-5253
>             Project: Geode
>          Issue Type: Improvement
>          Components: serialization
>            Reporter: Bruce Schuchardt
>            Priority: Major
>
> A regression test ran into corruption in the expectedValue argument of 
> remove(K,V) and put(K,V,V) operations when readPdxSerialized was enabled in 
> clients and servers.  Here's an example:
> {noformat}
> bridgegemfire5_28694/system.log: [error 2018/05/24 11:55:13.360 PDT bridgegemfire5_trout_28694 <PartitionedRegion Message Processor3> tid=0x92] Caught Exception
> org.apache.geode.pdx.PdxSerializationException: Exception deserializing a PDX field
>       at org.apache.geode.pdx.internal.PdxInputStream.readObject(PdxInputStream.java:250)
>       at org.apache.geode.pdx.internal.PdxInputStream.readObject(PdxInputStream.java:93)
>       at org.apache.geode.pdx.internal.PdxReaderImpl.readObject(PdxReaderImpl.java:333)
>       at org.apache.geode.pdx.internal.PdxInstanceImpl.readObject(PdxInstanceImpl.java:560)
>       at org.apache.geode.pdx.internal.PdxInstanceImpl.equals(PdxInstanceImpl.java:408)
>       at org.apache.geode.internal.cache.entries.AbstractRegionEntry.checkPdxEquals(AbstractRegionEntry.java:1163)
>       at org.apache.geode.internal.cache.entries.AbstractRegionEntry.checkEquals(AbstractRegionEntry.java:1030)
>       at org.apache.geode.internal.cache.entries.AbstractRegionEntry.checkExpectedOldValue(AbstractRegionEntry.java:955)
>       at org.apache.geode.internal.cache.entries.AbstractRegionEntry.destroy(AbstractRegionEntry.java:829)
>       at org.apache.geode.internal.cache.map.RegionMapDestroy.destroyEntry(RegionMapDestroy.java:723)
>       at org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:387)
>       at org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:238)
>       at org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:149)
>       at org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:1035)
>       at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6544)
>       at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6518)
>       at org.apache.geode.internal.cache.BucketRegion.basicDestroy(BucketRegion.java:1194)
>       at org.apache.geode.internal.cache.PartitionedRegionDataStore.destroyLocally(PartitionedRegionDataStore.java:1330)
>       at org.apache.geode.internal.cache.PartitionedRegionDataView.destroyOnRemote(PartitionedRegionDataView.java:107)
>       at org.apache.geode.internal.cache.partitioned.DestroyMessage.operateOnPartitionedRegion(DestroyMessage.java:268)
>       at org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:334)
>       at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:378)
>       at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:444)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
>       at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
>       at org.apache.geode.distributed.internal.ClusterDistributionManager$8$1.run(ClusterDistributionManager.java:945)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Unknown header byte: 116
>       at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:3100)
>       at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2978)
>       at org.apache.geode.pdx.internal.PdxInputStream.readObject(PdxInputStream.java:248)
>       ... 28 more
> {noformat}
> I was able to reproduce this and found that the bytes in the PdxInstance 
> were corrupted.  Sometimes the length of the underlying byte buffer (which 
> holds the serialized state of the object) was wrong; at other times, other 
> bytes had been copied over the state of the PdxInstance, leading to "bad 
> header byte" errors, BufferUnderflowExceptions, etc.
> I tracked this down to the PDX implementation having borrowed heavily from 
> the TCPConduit buffer-management classes.  If an object is deserialized 
> into PdxInstance form in a P2P reader thread, the object retains a 
> reference to the area of the P2P buffer that holds its serialized state.  
> If the object is then handed off to another thread, such as an executor 
> thread, it still points back into the P2P buffer.  When the P2P reader 
> thread goes on to read another message, it overwrites the buffer and 
> corrupts the state of the PdxInstance.
> This is more likely to happen with conserve-sockets=true, since a 
> thread-owned connection will typically handle only one message at a time 
> and will execute the method directly in the P2P reader thread.
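
The hand-off hazard described above can be sketched in plain Java. This is a hypothetical illustration using java.nio.ByteBuffer, not Geode's actual TCPConduit or PdxInstance code: a lazily deserialized value keeps a slice of a reader thread's reusable message buffer instead of copying its bytes out, so the next message read into that buffer corrupts the value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SharedBufferHazard {

  // Stands in for a PdxInstance that aliases the P2P read buffer
  // rather than owning a private copy of its serialized bytes.
  static final class LazyValue {
    private final ByteBuffer backing; // slice of the shared read buffer

    LazyValue(ByteBuffer backing) {
      this.backing = backing;
    }

    String decode() {
      ByteBuffer dup = backing.duplicate(); // don't disturb position/limit
      byte[] bytes = new byte[dup.remaining()];
      dup.get(bytes);
      return new String(bytes, StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) {
    // The reader thread's single reusable message buffer.
    ByteBuffer readBuffer = ByteBuffer.allocate(16);

    // Message 1 arrives; the value is "deserialized" lazily and only
    // records a slice of the shared buffer (no defensive copy).
    readBuffer.clear();
    readBuffer.put("expectedValue1".getBytes(StandardCharsets.UTF_8));
    readBuffer.flip();
    LazyValue handedOff = new LazyValue(readBuffer.slice());

    System.out.println("before: " + handedOff.decode());

    // The reader thread reads the next message into the SAME buffer,
    // overwriting the bytes the handed-off value still points at.
    readBuffer.clear();
    readBuffer.put("nextMessage999".getBytes(StandardCharsets.UTF_8));
    readBuffer.flip();

    // Another thread now sees corrupted state, mirroring the
    // "Unknown header byte" failures in the stack trace above.
    System.out.println("after:  " + handedOff.decode());
  }
}
```

The fix direction implied by the comment is to copy the serialized bytes out of the shared buffer before the object escapes the reader thread.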



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
