Bill Burcham created GEODE-9498:
-----------------------------------
Summary: testRecoverAfterConflict() test gets NPE in updateEntry()
because cache was closed by DiskStoreImpl.handleAccessException()
Key: GEODE-9498
URL: https://issues.apache.org/jira/browse/GEODE-9498
Project: Geode
Issue Type: Bug
Components: tests
Affects Versions: 1.15.0
Reporter: Bill Burcham
In this test result: https://hydradb.hdb.gemfire-ci.info/hdb/testresult/11201994
we see:
{code:java}
PersistentRecoveryOrderDUnitTest > testRecoverAfterConflict FAILED
    org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest$$Lambda$468/913921843.run in VM 0 running on Host heavy-lifter-bcc07c55-cc73-5e2a-b7db-b1a2f447cfc1.c.apachegeode-ci.internal with 4 VMs
        at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
        at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.testRecoverAfterConflict(PersistentRecoveryOrderDUnitTest.java:1328)

        Caused by:
        java.lang.NullPointerException
            at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.updateEntry(PersistentRecoveryOrderDUnitTest.java:1395)
            at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.lambda$testRecoverAfterConflict$bb17a952$5(PersistentRecoveryOrderDUnitTest.java:1331)
{code}
The test stdout in the test artifacts
http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0397/test-artifacts/1628631532/distributedtestfiles-openjdk8-1.15.0-build.0397.tgz
shows that this happened right before that exception:
{code:java}
[vm0] [info 2021/08/10 20:43:07.167 UTC <Idle OplogCompactor1> tid=0x4f3] Recovered values for disk store PersistentRecoveryOrderDUnitTest_testRecoverAfterConflictRegion with unique id 6b322dee-7839-4ff3-a1ec-77e4a8fef7e9
[vm0] [info 2021/08/10 20:43:07.170 UTC <Disk store exception handler> tid=0x4f4] GemFireCache[id = 1294377668; isClosing = true; isShutDownAll = false; created = Tue Aug 10 20:43:06 UTC 2021; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.
[vm0] [info 2021/08/10 20:43:07.192 UTC <Disk store exception handler> tid=0x4f4] Reinitializing JarDeploymentService with new working directory: null
[vm0] [info 2021/08/10 20:43:07.205 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] No locator(s) found with cluster configuration service
[vm0] [info 2021/08/10 20:43:07.216 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Reinitializing JarDeploymentService with new working directory: /home/geode/geode/geode-core/build/distributedTest/test-worker-000915/dunit/vm0
[vm0] [info 2021/08/10 20:43:07.401 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.management.internal.cli.remote.OnlineCommandProcessor
[vm0] [info 2021/08/10 20:43:07.402 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.cache.query.internal.QueryConfigurationServiceImpl
[vm0] [info 2021/08/10 20:43:07.402 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Enabled InternalHttpService on port 7070
[vm0] [info 2021/08/10 20:43:07.402 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.internal.cache.http.service.InternalHttpService
[vm0] [info 2021/08/10 20:43:07.404 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initializing region _monitoringRegion_10.0.0.140<v93>52668
[vm0] [info 2021/08/10 20:43:07.405 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialization of region _monitoringRegion_10.0.0.140<v93>52668 completed
[vm0] [info 2021/08/10 20:43:07.410 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Loading previously deployed jars
[vm0] [info 2021/08/10 20:43:07.411 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initializing region PdxTypes
[vm0] [info 2021/08/10 20:43:07.412 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialization of region PdxTypes completed
{code}
For a reason not yet known, DiskStoreImpl.handleAccessException() was called.
That code spins up a "Disk store exception handler" thread to close the cache.
The cache was closed before the test reached
PersistentRecoveryOrderDUnitTest.updateEntry(), and that caused the NPE.
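A minimal sketch of how that sequence could produce the NPE. It assumes that
updateEntry() looks the region up from the cache and then calls put() on it,
and that the test's getCache() helper creates a fresh cache when the previous
one has been closed (the log shows cache services being initialized on the RMI
thread right after the close, which is consistent with that). FakeCache,
getCache(), and both updateEntry variants below are hypothetical stand-ins
written for illustration; none of this is Geode code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheCloseRaceSketch {

  /** Hypothetical stand-in for a cache; this is NOT the Geode Cache API. */
  static final class FakeCache {
    private final Map<String, Map<String, String>> regions = new ConcurrentHashMap<>();
    private volatile boolean closed;

    Map<String, String> createRegion(String name) {
      return regions.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    /** Returns null when the region does not exist in this cache. */
    Map<String, String> getRegion(String name) {
      return regions.get(name);
    }

    void close() {
      closed = true;
      regions.clear();
    }

    boolean isClosed() {
      return closed;
    }
  }

  private static FakeCache cache;

  /** Assumption: like a test helper, hands out a new cache if the old one was closed. */
  static synchronized FakeCache getCache() {
    if (cache == null || cache.isClosed()) {
      cache = new FakeCache();
    }
    return cache;
  }

  /** Assumed shape of the failing call: no null check on the region lookup. */
  static void updateEntryUnguarded(String regionName) {
    Map<String, String> region = getCache().getRegion(regionName);
    region.put("A", "C"); // throws NullPointerException when the region is gone
  }

  /** Same call, but failing with a message that points at the closed cache. */
  static void updateEntryGuarded(String regionName) {
    Map<String, String> region = getCache().getRegion(regionName);
    if (region == null) {
      throw new IllegalStateException(
          "Region " + regionName + " not found; was the cache closed and recreated?");
    }
    region.put("A", "C");
  }

  /** Replays the observed order: region created, handler closes cache, then update. */
  static String simulate() {
    getCache().createRegion("testRegion");
    Thread handler = new Thread(() -> getCache().close(), "Disk store exception handler");
    handler.start();
    try {
      handler.join(); // force the close to finish first, as it did in the failing run
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    try {
      updateEntryUnguarded("testRegion");
      return "no exception";
    } catch (NullPointerException e) {
      return "NullPointerException";
    }
  }

  public static void main(String[] args) {
    System.out.println(simulate());
  }
}
```

The join() makes the ordering deterministic here; in the real test the close
happens on a background thread, which would explain why the failure is
intermittent. A null check like the one in updateEntryGuarded() would not fix
the underlying problem, but it would make the failure message point at the
closed cache rather than at a bare NPE.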
The test passed each of the few times I ran it locally in the IDE. I haven't
found the root cause of this problem.