[ 
https://issues.apache.org/jira/browse/CASSANDRA-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066167#comment-18066167
 ] 

Sam Lightfoot edited comment on CASSANDRA-21220 at 3/16/26 6:22 PM:
--------------------------------------------------------------------

I had a quick look at the setup here. _jvm-dtest_ runs on medium nodes with a 
5GB memory limit, yet build.xml L1958 sets _Xmx8G_ for _jvm-dtest_. I assume 
tests on JDKs older than 21 are fine because they use G1 GC, whereas ZGC 
typically has a higher memory footprint, which is likely what causes the 
intermittent OOM.

We could try reducing Xmx to fit within the memory limit, or run _jvm-dtest_ on 
large nodes, similar to _dtest_. Whichever option we pick, Xmx should be less 
than the memory limit of the medium nodes.
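
To make the mismatch concrete, here is a rough sketch of the arithmetic. It assumes the {{-Xmx}} suffix is binary (as the JVM treats it) and that {{resourceLimitMemory}} is read as a Kubernetes-style quantity, where a bare {{G}} is decimal. The helper names are hypothetical and not part of the Cassandra build.

```python
# JVM -Xmx suffixes are powers of 1024.
JVM_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}

# Assumption: the container limit is a Kubernetes-style quantity, where
# K/M/G style suffixes are powers of 1000 and Ki/Mi/Gi are powers of 1024.
K8S_UNITS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
    "k": 1000, "M": 1000**2, "G": 1000**3,
}


def jvm_heap_bytes(flag):
    """Parse a flag such as '-Xmx8G' into a byte count."""
    if not flag.startswith("-Xmx"):
        raise ValueError("not an -Xmx flag: " + flag)
    value = flag[len("-Xmx"):]
    if value[-1].isdigit():
        return int(value)  # plain byte count, no suffix
    return int(value[:-1]) * JVM_UNITS[value[-1].lower()]


def k8s_quantity_bytes(quantity):
    """Parse a quantity such as '5G' or '3400M' into a byte count."""
    # Check two-character suffixes first so 'Gi' is not mistaken for 'G'.
    for suffix in sorted(K8S_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * K8S_UNITS[suffix]
    return int(quantity)


def heap_fits(xmx_flag, limit):
    """True only if the maximum heap is strictly below the container limit."""
    return jvm_heap_bytes(xmx_flag) < k8s_quantity_bytes(limit)


heap = jvm_heap_bytes("-Xmx8G")
limit = k8s_quantity_bytes("5G")
print(heap_fits("-Xmx8G", "5G"))  # False
print(heap - limit)               # 3589934592
```

Under those assumptions the maximum heap alone exceeds the container limit by roughly 3.6 GB, before counting metaspace, thread stacks, or direct buffers.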

_Trace_

 *Step 1* — {{build.xml}} (lines 1950-1959): jvm-dtest sets {{-Xmx8G}}
{code:xml}
<target name="test-jvm-dtest" ...>
    <jvmarg value="-Xmx8G"/>
</target>
{code}
*Step 2* — {{Jenkinsfile}} (line 197): no {{size}} specified
{code:groovy}
'jvm-dtest': [splits: 16],
{code}
*Step 3* — {{Jenkinsfile}} (lines 216-217): defaults to {{'medium'}}
{code:groovy}
if (!it.value['size']) {
    it.value.put('size', 'medium')
}
{code}
*Step 4* — {{Jenkinsfile}} (line 504): node label becomes 
{{cassandra-amd64-medium}}
{code:groovy}
def label = "cassandra-${cell.arch}-${command.size}"
{code}
*Step 5* — {{jenkins-deployment.yaml}} (lines 285-286): medium dind container 
limit
{code:yaml}
resourceRequestMemory: 3400M
resourceLimitMemory: 5G
{code}
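
The trace above can be replayed as a small sketch. This is not the Jenkinsfile logic itself: {{TASKS}}, {{MEMORY_LIMITS}}, and the function names are hypothetical stand-ins that only mirror the values quoted in Steps 2 to 5.

```python
# Hypothetical stand-in for the Jenkinsfile task map (Step 2):
# jvm-dtest declares splits but no size.
TASKS = {"jvm-dtest": {"splits": 16}}

# Hypothetical stand-in for the per-size container limits (Step 5).
# Only the medium value is quoted in the trace, so only it is listed.
MEMORY_LIMITS = {"medium": "5G"}


def resolve_size(task):
    """Step 3: a task without a size falls back to 'medium'."""
    return task.get("size") or "medium"


def node_label(arch, size):
    """Step 4: the node label is built from the architecture and the size."""
    return "cassandra-{}-{}".format(arch, size)


def trace(task_name, arch="amd64"):
    """Return the node label and memory limit a task ends up with."""
    size = resolve_size(TASKS[task_name])
    return node_label(arch, size), MEMORY_LIMITS[size]


print(trace("jvm-dtest"))  # ('cassandra-amd64-medium', '5G')
```

Adding a {{size}} entry to the task is the only input that changes the resolved label in this sketch, which matches the option of moving _jvm-dtest_ to large nodes.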



> Test failure: org.apache.cassandra.distributed.test.SinglePartitionReadCommandTest
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21220
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21220
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/java
>            Reporter: Sam Tunnicliffe
>            Priority: Normal
>
> See: https://butler.cassandra.apache.org/#/ci/upstream/workflow/Cassandra-trunk/failure/org.apache.cassandra.distributed.test/SinglePartitionReadCommandTest/testNonCompactTableWithOnlyUpdatedColumnOnOneNodeAndColumnDeletionOnTheOther
> This seems to be failing due to OOM on a semi-regular basis, on jdk21 only.
> Typical stacktrace: 
> {code}
> org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions were thrown during test
>       at org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1218)
>       at org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1203)
>       at org.apache.cassandra.distributed.test.SinglePartitionReadCommandTest.testNonCompactTableWithOnlyUpdatedColumnOnOneNodeAndColumnDeletionOnTheOther(SinglePartitionReadCommandTest.java:56)
>       at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:75)
>       at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:52)
>       Suppressed: java.lang.OutOfMemoryError: Java heap space
>               at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:71)
>               at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:391)
>               at org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:138)
>               at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:103)
>               at org.apache.cassandra.utils.memory.MemtableBufferAllocator$MemtableByteBufferCloner.allocate(MemtableBufferAllocator.java:61)
>               at org.apache.cassandra.utils.memory.ByteBufferCloner.clone(ByteBufferCloner.java:100)
>               at org.apache.cassandra.utils.memory.ByteBufferCloner.clone(ByteBufferCloner.java:86)
>               at org.apache.cassandra.utils.memory.ByteBufferCloner.clone(ByteBufferCloner.java:47)
>               at org.apache.cassandra.db.memtable.SkipListMemtable.put(SkipListMemtable.java:125)
>               at org.apache.cassandra.db.memtable.Memtable.put(Memtable.java:187)
>               at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1511)
>               at org.apache.cassandra.db.CassandraTableWriteHandler.write(CassandraTableWriteHandler.java:38)
>               at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:579)
>               at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:434)
>               at org.apache.cassandra.db.Mutation.apply(Mutation.java:297)
>               at org.apache.cassandra.db.Mutation.apply(Mutation.java:317)
>               at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:846)
>               at org.apache.cassandra.cql3.statements.ModificationStatement.executeLocally(ModificationStatement.java:837)
>               at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:519)
>               at org.apache.cassandra.db.SystemKeyspace.updateCompactionHistory(SystemKeyspace.java:729)
>               at org.apache.cassandra.db.compaction.CompactionTask.updateCompactionHistory(CompactionTask.java:409)
>               at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:335)
>               at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:26)
>               at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:99)
>               at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:110)
>               at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:408)
>               at org.apache.cassandra.concurrent.FutureTask$3.call(FutureTask.java:141)
>               at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>               at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>               at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>               at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>               at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>       Suppressed: [CIRCULAR REFERENCE: java.lang.OutOfMemoryError: Java heap space]
> {code}


