Bill Burcham created GEODE-9498:
-----------------------------------

             Summary: testRecoverAfterConflict() test gets NPE in updateEntry() because cache was closed by DiskStoreImpl.handleAccessException()
                 Key: GEODE-9498
                 URL: https://issues.apache.org/jira/browse/GEODE-9498
             Project: Geode
          Issue Type: Bug
          Components: tests
    Affects Versions: 1.15.0
            Reporter: Bill Burcham

In this test result: https://hydradb.hdb.gemfire-ci.info/hdb/testresult/11201994 we see:

{code:java}
PersistentRecoveryOrderDUnitTest > testRecoverAfterConflict FAILED
    org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest$$Lambda$468/913921843.run in VM 0 running on Host heavy-lifter-bcc07c55-cc73-5e2a-b7db-b1a2f447cfc1.c.apachegeode-ci.internal with 4 VMs
        at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
        at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.testRecoverAfterConflict(PersistentRecoveryOrderDUnitTest.java:1328)

        Caused by:
        java.lang.NullPointerException
            at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.updateEntry(PersistentRecoveryOrderDUnitTest.java:1395)
            at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.lambda$testRecoverAfterConflict$bb17a952$5(PersistentRecoveryOrderDUnitTest.java:1331)
{code}

Looking at the test stdout from the test artifacts http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0397/test-artifacts/1628631532/distributedtestfiles-openjdk8-1.15.0-build.0397.tgz we see this happened right before that exception:

{code:java}
[vm0] [info 2021/08/10 20:43:07.167 UTC <Idle OplogCompactor1> tid=0x4f3] Recovered values for disk store PersistentRecoveryOrderDUnitTest_testRecoverAfterConflictRegion with unique id 6b322dee-7839-4ff3-a1ec-77e4a8fef7e9
[vm0] [info 2021/08/10 20:43:07.170 UTC <Disk store exception handler> tid=0x4f4] GemFireCache[id = 1294377668; isClosing = true; isShutDownAll = false; created = Tue Aug 10 20:43:06 UTC 2021; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.
[vm0] [info 2021/08/10 20:43:07.192 UTC <Disk store exception handler> tid=0x4f4] Reinitializing JarDeploymentService with new working directory: null
[vm0] [info 2021/08/10 20:43:07.205 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] No locator(s) found with cluster configuration service
[vm0] [info 2021/08/10 20:43:07.216 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Reinitializing JarDeploymentService with new working directory: /home/geode/geode/geode-core/build/distributedTest/test-worker-000915/dunit/vm0
[vm0] [info 2021/08/10 20:43:07.401 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.management.internal.cli.remote.OnlineCommandProcessor
[vm0] [info 2021/08/10 20:43:07.402 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.cache.query.internal.QueryConfigurationServiceImpl
[vm0] [info 2021/08/10 20:43:07.402 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Enabled InternalHttpService on port 7070
[vm0] [info 2021/08/10 20:43:07.402 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.internal.cache.http.service.InternalHttpService
[vm0] [info 2021/08/10 20:43:07.404 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initializing region _monitoringRegion_10.0.0.140<v93>52668
[vm0] [info 2021/08/10 20:43:07.405 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialization of region _monitoringRegion_10.0.0.140<v93>52668 completed
[vm0] [info 2021/08/10 20:43:07.410 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Loading previously deployed jars
[vm0] [info 2021/08/10 20:43:07.411 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initializing region PdxTypes
[vm0] [info 2021/08/10 20:43:07.412 UTC <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialization of region PdxTypes completed
{code}

For some reason DiskStoreImpl.handleAccessException() was called. That code spins up a "Disk store exception handler" to close the cache.
The cache was closed before the test reached PersistentRecoveryOrderDUnitTest.updateEntry(), and that caused the NPE.

The test succeeds when I run it locally a few times in the IDE. I haven't found the root cause of this problem.
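
The failure mode described above (a background handler thread closes the cache, then a later test step dereferences something that no longer exists) can be illustrated with a minimal, self-contained sketch. This is not Geode code and not the root cause: FakeCache, closeAsync(), updateEntryUnguarded() and updateEntryGuarded() are all hypothetical names invented for the illustration, and the join() call only forces the "close wins" ordering that appears to have occurred in the CI run.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheCloseRaceSketch {

  /** Hypothetical stand-in for a cache; NOT Geode's cache implementation. */
  static final class FakeCache {
    private final Map<String, Map<String, String>> regions = new ConcurrentHashMap<>();

    void createRegion(String name) {
      regions.put(name, new ConcurrentHashMap<>());
    }

    /** Returns null when the region is gone, e.g. after close(). */
    Map<String, String> getRegion(String name) {
      return regions.get(name);
    }

    void close() {
      regions.clear();
    }
  }

  /** Closes the cache on a separate thread, as an asynchronous handler might. */
  static Thread closeAsync(FakeCache cache) {
    Thread handler = new Thread(cache::close, "Disk store exception handler");
    handler.start();
    return handler;
  }

  /** Dereferences the region without a null check; throws NPE if it is gone. */
  static void updateEntryUnguarded(FakeCache cache, String regionName) {
    cache.getRegion(regionName).put("A", "C");
  }

  /** Same update, but reports a missing region with a descriptive exception. */
  static void updateEntryGuarded(FakeCache cache, String regionName) {
    Map<String, String> region = cache.getRegion(regionName);
    if (region == null) {
      throw new IllegalStateException(
          "Region " + regionName + " not found; was the cache closed?");
    }
    region.put("A", "C");
  }

  public static void main(String[] args) throws InterruptedException {
    FakeCache cache = new FakeCache();
    cache.createRegion("testRegion");

    // join() makes the close finish first; in a real race the order varies.
    closeAsync(cache).join();

    try {
      updateEntryUnguarded(cache, "testRegion");
    } catch (NullPointerException e) {
      System.out.println("Unguarded update failed with NullPointerException");
    }

    try {
      updateEntryGuarded(cache, "testRegion");
    } catch (IllegalStateException e) {
      System.out.println("Guarded update failed: " + e.getMessage());
    }
  }
}
{code}

A guard like the one in updateEntryGuarded() would not fix the underlying problem; it would only make a recurrence easier to diagnose than a bare NPE. Why handleAccessException() was invoked in the first place remains the open question.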