Bill Burcham created GEODE-9498:
-----------------------------------

             Summary: testRecoverAfterConflict() test gets NPE in updateEntry() because cache was closed by DiskStoreImpl.handleAccessException()
                 Key: GEODE-9498
                 URL: https://issues.apache.org/jira/browse/GEODE-9498
             Project: Geode
          Issue Type: Bug
          Components: tests
    Affects Versions: 1.15.0
            Reporter: Bill Burcham


In this test result: https://hydradb.hdb.gemfire-ci.info/hdb/testresult/11201994 
we see:

{code:java}
PersistentRecoveryOrderDUnitTest > testRecoverAfterConflict FAILED
    org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest$$Lambda$468/913921843.run in VM 0 running on Host heavy-lifter-bcc07c55-cc73-5e2a-b7db-b1a2f447cfc1.c.apachegeode-ci.internal with 4 VMs
        at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
        at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.testRecoverAfterConflict(PersistentRecoveryOrderDUnitTest.java:1328)

        Caused by:
        java.lang.NullPointerException
            at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.updateEntry(PersistentRecoveryOrderDUnitTest.java:1395)
            at org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderDUnitTest.lambda$testRecoverAfterConflict$bb17a952$5(PersistentRecoveryOrderDUnitTest.java:1331)
{code}

Looking at the test stdout from the test artifacts at
http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0397/test-artifacts/1628631532/distributedtestfiles-openjdk8-1.15.0-build.0397.tgz
we see that this happened right before that exception:

{code:java}
[vm0] [info 2021/08/10 20:43:07.167 UTC  <Idle OplogCompactor1> tid=0x4f3] Recovered values for disk store PersistentRecoveryOrderDUnitTest_testRecoverAfterConflictRegion with unique id 6b322dee-7839-4ff3-a1ec-77e4a8fef7e9

[vm0] [info 2021/08/10 20:43:07.170 UTC  <Disk store exception handler> tid=0x4f4] GemFireCache[id = 1294377668; isClosing = true; isShutDownAll = false; created = Tue Aug 10 20:43:06 UTC 2021; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.

[vm0] [info 2021/08/10 20:43:07.192 UTC  <Disk store exception handler> tid=0x4f4] Reinitializing JarDeploymentService with new working directory: null

[vm0] [info 2021/08/10 20:43:07.205 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] No locator(s) found with cluster configuration service

[vm0] [info 2021/08/10 20:43:07.216 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Reinitializing JarDeploymentService with new working directory: /home/geode/geode/geode-core/build/distributedTest/test-worker-000915/dunit/vm0

[vm0] [info 2021/08/10 20:43:07.401 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.management.internal.cli.remote.OnlineCommandProcessor

[vm0] [info 2021/08/10 20:43:07.402 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.cache.query.internal.QueryConfigurationServiceImpl

[vm0] [info 2021/08/10 20:43:07.402 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Enabled InternalHttpService on port 7070

[vm0] [info 2021/08/10 20:43:07.402 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialized cache service org.apache.geode.internal.cache.http.service.InternalHttpService

[vm0] [info 2021/08/10 20:43:07.404 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initializing region _monitoringRegion_10.0.0.140<v93>52668

[vm0] [info 2021/08/10 20:43:07.405 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialization of region _monitoringRegion_10.0.0.140<v93>52668 completed

[vm0] [info 2021/08/10 20:43:07.410 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Loading previously deployed jars

[vm0] [info 2021/08/10 20:43:07.411 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initializing region PdxTypes

[vm0] [info 2021/08/10 20:43:07.412 UTC  <RMI TCP Connection(1)-10.0.0.140> tid=0x21] Initialization of region PdxTypes completed
{code}

DiskStoreImpl.handleAccessException() was called, for a reason I have not yet 
identified. That method spins up a "Disk store exception handler" thread to 
close the cache. The cache was closed before the test reached 
PersistentRecoveryOrderDUnitTest.updateEntry(), and that caused the NPE.
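
To illustrate the kind of interleaving described above, here is a minimal, self-contained sketch. It does not use Geode at all: ToyCache, getCache() and the region name are hypothetical stand-ins. It also assumes, as one plausible reading of the log (not confirmed), that after the handler thread closed the cache the test's cache accessor created a fresh cache that no longer contained the test region, so the region lookup returned null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheCloseRaceSketch {

    /** Hypothetical stand-in for a cache; this is NOT Geode's API. */
    static final class ToyCache {
        private final Map<String, Map<String, String>> regions = new ConcurrentHashMap<>();
        private volatile boolean closed;

        Map<String, String> createRegion(String name) {
            return regions.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
        }

        /** Returns null when the region does not exist in this cache. */
        Map<String, String> getRegion(String name) {
            return regions.get(name);
        }

        void close() {
            closed = true;
            regions.clear();
        }

        boolean isClosed() {
            return closed;
        }
    }

    static volatile ToyCache cache = new ToyCache();

    /** Assumed accessor behaviour: silently replace a closed cache with a new, empty one. */
    static synchronized ToyCache getCache() {
        if (cache.isClosed()) {
            cache = new ToyCache();
        }
        return cache;
    }

    /** Returns true if the update step hits a NullPointerException. */
    static boolean updateEntryHitsNpe() {
        cache = new ToyCache();
        cache.createRegion("testRegion");

        // Background thread that closes the cache, like the handler thread in the log.
        Thread handler = new Thread(() -> cache.close(), "Disk store exception handler");
        handler.start();
        try {
            // Joining here forces the losing interleaving deterministically;
            // in the real failure this ordering would be a matter of timing.
            handler.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        try {
            // The new cache has no "testRegion", so getRegion() returns null.
            getCache().getRegion("testRegion").put("A", "C");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE reproduced: " + updateEntryHitsNpe());
    }
}
```

The sketch only shows why a cache closed from another thread can surface as an NPE in the test thread. It says nothing about why handleAccessException() was invoked in the first place.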

The test succeeds when I run it locally a few times in the IDE. I haven't found 
the root cause of this problem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
