[ https://issues.apache.org/jira/browse/GEODE-482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kirk Lund updated GEODE-482:
----------------------------
    Comment: was deleted

(was: A Pivotal Tracker story has been created for this Issue: https://www.pivotaltracker.com/story/show/106767128)

> deserialization can hang for one minute waiting for a DataSerializer
> --------------------------------------------------------------------
>
>                 Key: GEODE-482
>                 URL: https://issues.apache.org/jira/browse/GEODE-482
>             Project: Geode
>          Issue Type: Bug
>          Components: serialization
>            Reporter: Darrel Schneider
>
> If a JVM does not explicitly register a DataSerializer it is going to use, but
> instead relies on Geode to distribute the DataSerializer to it from another
> member or server, then a race condition exists that can cause it to wait for 1
> minute and fail to find the DataSerializer.
> The workaround for this is to explicitly register the DataSerializer using a
> static initializer or the cache.xml serializer element.
> A unit test was intermittently hitting this problem (see GEODE-376), but that
> test has been changed to work around the race in the product.
> The race is in this code in
> com.gemstone.gemfire.internal.InternalDataSerializer.getSerializer(int):
>     SerializerAttributesHolder sah=idsToHolders.get(idx);
>     while (result == null && !timedOut && sah == null) {
>       Object o = idsToSerializers.putIfAbsent(idx, marker);
>       if (o == null) {
>         result = marker.getSerializer();
> If getSerializer sees a null "sah" but, before it can do the
> "idsToSerializers.putIfAbsent", another thread executes this code in
> com.gemstone.gemfire.internal.InternalDataSerializer.register(String,
> boolean, SerializerAttributesHolder):
>     if (className == null || className.trim().equals("")) {
>       throw new IllegalArgumentException("Class name cannot be null or
> empty.");
>     }
>     SerializerAttributesHolder oldValue = dsClassesToHolders.putIfAbsent(
>         className, holder);
>     if (oldValue != null) {
>       if (oldValue.getId() != 0 && holder.getId() != 0
>           && oldValue.getId() != holder.getId()) {
>         throw new IllegalStateException(snip);
>       }
>     }
>     idsToHolders.putIfAbsent(holder.getId(), holder);
>     Object ds = idsToSerializers.get(holder.getId());
>     if (ds instanceof Marker) {
>       synchronized (ds) {
>         ((Marker)ds).notifyAll();
>       }
>     }
> then the registering thread does not see the Marker and does not notify it.
> That leaves the first thread stuck in Marker.getSerializer, which blocks for 1
> minute and then returns null.
> A new test needs to be written that will reliably fail for this bug.
> A multi-threaded unit test that uses these two methods would be best.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
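
The interleaving described in the issue can be replayed deterministically with a small stand-alone model. The sketch below is not Geode code and is not the requested multi-threaded test: MarkerRaceSketch, its simplified Marker, the String holder, and the replay method are all hypothetical names, and the two maps only mimic idsToHolders and idsToSerializers. It runs the reader's null check, then the whole of a simplified register(), then the reader's putIfAbsent and wait, all on one thread, so the lost notification happens every time. The recheckAfterInstall flag illustrates one possible remedy (looking at the holder map again after installing the Marker); whether the product was fixed this way is an assumption, not something the issue states.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

public class MarkerRaceSketch {

  /** Hypothetical stand-in for the Marker: a reader blocks on it until woken or timed out. */
  static final class Marker {
    private boolean woken;

    synchronized void wakeUp() {
      woken = true;
      notifyAll();
    }

    /** Returns true if woken before the timeout elapsed, false on timeout or interrupt. */
    synchronized boolean await(long timeoutMs) {
      long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
      try {
        while (!woken) {
          long remaining = deadline - System.nanoTime();
          if (remaining <= 0) {
            return false;
          }
          TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        return true;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  // Simplified models of the two maps named in the issue (value types are assumptions).
  private final ConcurrentMap<Integer, String> idsToHolders = new ConcurrentHashMap<>();
  private final ConcurrentMap<Integer, Object> idsToSerializers = new ConcurrentHashMap<>();

  /** Models register(): publish the holder, then notify a Marker only if one is already present. */
  void register(int id, String holder) {
    idsToHolders.putIfAbsent(id, holder);
    Object ds = idsToSerializers.get(id);
    if (ds instanceof Marker) {
      ((Marker) ds).wakeUp();
    }
  }

  /**
   * Replays the losing interleaving on a single thread.
   * Returns true if the reader learns about the holder without timing out.
   */
  public static boolean replay(boolean recheckAfterInstall, long timeoutMs) {
    MarkerRaceSketch sketch = new MarkerRaceSketch();
    int id = 42;

    // Reader, step 1: the holder is not registered yet, so the reader decides to wait.
    if (sketch.idsToHolders.get(id) != null) {
      return true;
    }

    // Registrar runs completely in the gap and finds no Marker to notify.
    sketch.register(id, "holder");

    // Reader, step 2: install the Marker only now, after the notification was skipped.
    Marker marker = new Marker();
    sketch.idsToSerializers.putIfAbsent(id, marker);

    // Possible remedy (an assumption): look at the holder map again before blocking.
    if (recheckAfterInstall && sketch.idsToHolders.get(id) != null) {
      return true;
    }
    return marker.await(timeoutMs);
  }

  public static void main(String[] args) {
    System.out.println("original ordering found holder: " + replay(false, 200));
    System.out.println("with re-check found holder: " + replay(true, 200));
  }
}
```

With the original ordering the reader waits out the full timeout (200 ms here, standing in for the one-minute wait) and reports false; with the re-check it returns true immediately. A real regression test would still need two threads driving getSerializer and register, as the issue asks.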