Kirk Lund created GEODE-10323:
---------------------------------

             Summary: OffHeapStorageJUnitTest testCreateOffHeapStorage fails 
with AssertionError: expected:<100> but was:<1048576>
                 Key: GEODE-10323
                 URL: https://issues.apache.org/jira/browse/GEODE-10323
             Project: Geode
          Issue Type: Bug
          Components: offheap
            Reporter: Kirk Lund


{{OffHeapStorageJUnitTest testCreateOffHeapStorage}} started failing 
intermittently due to commit a350ed2 which went in as 
"[GEODE-10087|https://issues.apache.org/jira/browse/GEODE-10087]: Enhance 
off-heap fragmentation visibility. 
([#7407|https://github.com/apache/geode/pull/7407/commits/1640effc5c760afa8d9ec4c2743950bb1de0ef8f])".

The failure stack looks like:
{noformat}
OffHeapStorageJUnitTest > testCreateOffHeapStorage FAILED
    java.lang.AssertionError: expected:<100> but was:<1048576>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:647)
        at org.junit.Assert.assertEquals(Assert.java:633)
        at 
org.apache.geode.internal.offheap.OffHeapStorageJUnitTest.testCreateOffHeapStorage(OffHeapStorageJUnitTest.java:220)
{noformat}

The cause is in {{MemoryAllocatorImpl}}. A new scheduled executor 
{{updateNonRealTimeStatsExecutor}} was added near the bottom of the constructor 
which immediately gets scheduled to invoke {{freeList::updateNonRealTimeStats}} 
repeatedly at the specified frequency of {{updateOffHeapStatsFrequencyMs}} 
whenever an instance of {{MemoryAllocatorImpl}} is created.

{{freeList::updateNonRealTimeStats}} then updates {{setLargestFragment}} and 
{{setFreedChunks}}.

Scheduling or starting any sort of threads from a constructor is considered a 
bad practice and should only be done via some sort of activation method such as 
{{start()}} which can be invoked from the static factory method for 
{{MemoryAllocatorImpl}}. The unit tests would then continue to construct 
instances via a constructor that does NOT invoke {{start()}}.

{{OffHeapStorageJUnitTest testCreateOffHeapStorage}}, and possibly other tests, 
is written with the assumption that no other thread or component can or will be 
updating the {{OffHeapStats}}. Hence it is then safe for the test to 
setLargestFragment and then assert that it has the value passed in:
{noformat}
219:      stats.setLargestFragment(100);
220:      assertEquals(100, stats.getLargestFragment());
{noformat}

Line 220 is the source of the assertion failure because 
{{updateNonRealTimeStatsExecutor}} is racing with the JUnit thread and changes 
the value of {{setLargestFragment}} immediately after the test sets the value 
to 100, but before the test invokes the assertion.

Looking back at the [PR test 
results|https://cio.hdb.gemfire-ci.info/commits/pr/7407/junit?days=90] we can 
see that stress-new-test failed 5 times with this exact failure before being 
merged to develop:
* 
http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646140652/
* 
http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646154134/
 (x2)
* 
http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646156997/
* 
http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646170346/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to