Re: Test failures on Windows with insufficient memory for the JRE while running distributed tests

Dale Emery Wed, 27 Oct 2021 11:57:39 -0700

> *Do the Gfsh distributed tests on Windows leave behind more artifacts on
> the harddrive than other test targets?*


On Linux, the artifact file for a full distributed test run is ~750 MB.

On Windows, the artifact file for just the gfsh distributed tests is ~1 GB.

> *Are we running the Gfsh distributed tests in parallel (which might
> exacerbate harddrive swapping or memory consumption)?*

On Linux, the full distributed test suite executes as many as 24 test classes 
in parallel (each in its own test JVM).

On Windows, the gfsh distributed tests do not currently execute in parallel.

I don’t know the answers to the other questions.
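
One of those open questions, whether the tests run on a 32-bit JVM, could probably be checked from inside the JVM itself. A minimal sketch, assuming a HotSpot-based JDK: JvmBitness is a hypothetical name (not part of Geode), and sun.arch.data.model is a HotSpot-specific property that may be absent on other JVMs.

```java
// Hypothetical diagnostic; class and method names are illustrative, not part of Geode.
final class JvmBitness {

    /** Returns a one-line summary of the architecture properties this JVM reports. */
    static String describe() {
        // Standard property: the architecture the running JVM reports (e.g. "amd64" or "x86").
        String osArch = System.getProperty("os.arch");
        // HotSpot-specific property, normally "32" or "64"; fall back to "unknown" if absent.
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        return "java.home=" + System.getProperty("java.home")
            + " os.arch=" + osArch
            + " sun.arch.data.model=" + dataModel;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this with the same java executable that launches the test VMs, and the one that spawns the Gfsh processes, should show whether either is 32-bit.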

Dale

From: Alberto Gomez <alberto.go...@est.tech>
Date: Wednesday, October 27, 2021 at 10:21 AM
To: dev@geode.apache.org <dev@geode.apache.org>
Subject: Re: Test failures on Windows with insufficient memory for the JRE 
while running distributed tests

Thanks, Kirk.

Could an expert on the OS images and pipeline jump in to answer Kirk's
questions and help?

Thanks,

Alberto
________________________________
From: Kirk Lund <kl...@apache.org>
Sent: Tuesday, October 26, 2021 7:26 PM
To: dev@geode.apache.org <dev@geode.apache.org>
Subject: Re: Test failures on Windows with insufficient memory for the JRE 
while running distributed tests

PS: I should also mention that the *windows-gfsh-distributed* test target
is only run on Windows (never on Linux). It might be useful to try getting
windows-gfsh-distributed running on Linux to see if it hits the same issue
on that OS. This would also require some help from a pipeline expert.
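
To compare the Windows and Linux images, it might also help to print disk and swap figures from a JVM on each. A rough sketch, assuming a HotSpot-based JDK where the com.sun.management extension of OperatingSystemMXBean is available; HostResources is a hypothetical name, not something in Geode.

```java
import java.io.File;
import java.lang.management.ManagementFactory;

// Hypothetical diagnostic; class and method names are illustrative, not part of Geode.
final class HostResources {

    private static final long MB = 1024L * 1024L;

    /** Summarizes usable disk space under dir and, where the JVM exposes it, swap space. */
    static String describe(File dir) {
        StringBuilder sb = new StringBuilder();
        // Space on the partition holding dir that this JVM is allowed to write to.
        sb.append("usableDiskMB=").append(dir.getUsableSpace() / MB);

        java.lang.management.OperatingSystemMXBean os =
            ManagementFactory.getOperatingSystemMXBean();
        // Swap figures come from a JDK-specific extension, so check before casting.
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            com.sun.management.OperatingSystemMXBean ext =
                (com.sun.management.OperatingSystemMXBean) os;
            sb.append(" freeSwapMB=").append(ext.getFreeSwapSpaceSize() / MB);
            sb.append(" totalSwapMB=").append(ext.getTotalSwapSpaceSize() / MB);
        } else {
            sb.append(" swap=unavailable");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Report on the current working directory by default.
        System.out.println(describe(new File(".")));
    }
}
```

Printing this at the start and end of a test run on each OS would give a first indication of whether swap or disk space is being exhausted.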

On Tue, Oct 26, 2021 at 10:22 AM Kirk Lund <kl...@apache.org> wrote:

> Hi Alberto,
>
> 32 KB is a very small amount of memory, so I don't think it's related to
> Java Heap. Based on what little I've read today, I think a failure in
> ChunkPool::allocate is probably related to either *running out of swap
> space or running out of address space in a 32-bit JVM*. Since the
> failures are OS specific, I would suspect the machine image we use for
> Windows to be involved.
>
> I also notice that this ChunkPool::allocate failure is only occurring for
> the Gfsh distributed tests, which is the only job run on Windows that uses
> Gradle support for *JUnit Categories*. The Gradle target is
> distributedTest which we have configured with "*forkEvery 1*" which
> causes every test class to launch in a new JVM. Gradle implements JUnit
> 4 Category filtering by launching every test class to check the Categories
> and then either executes the tests or terminates without running any
> depending on the Categories.
>
> Some things I would check (or ask others about):
>
> *Is the harddrive space much smaller than what's available to the JVM(s)
> on Linux?*
>
> *Do the Gfsh distributed tests on Windows leave behind more artifacts on
> the harddrive than other test targets?*
>
> *Is it possible that the tests are using a 32-bit JVM on Windows? Or maybe
> the tests are spawning Gfsh process(es) using a 32-bit JVM instead of
> 64-bit?*
>
> *Are we running the Gfsh distributed tests in parallel (which might
> exacerbate harddrive swapping or memory consumption)?*
>
> Unfortunately, I don't know what most of the options in
> jinja.variables.yml are about. I think it would be best to get help from an
> expert in the OS images and pipeline details.
>
> Cheers,
> Kirk
>
> On Tue, Oct 26, 2021 at 12:59 AM Alberto Gomez <alberto.go...@est.tech>
> wrote:
>
>> Hi,
>>
>> I am having issues with insufficient memory for the Java Runtime
>> Environment when running some tests on the CI under Windows from the
>> following PR :
>> https://github.com/apache/geode/pull/7006
>>
>> The tests never fail under Linux.
>>
>> This is the error I get for some VMs:
>>
>> [vm4] # There is insufficient memory for the Java Runtime Environment to
>> continue.
>> [vm4] # Native memory allocation (malloc) failed to allocate 32744 bytes
>> for ChunkPool::allocate
>>
>> I have reduced the amount of resources originally used by the tests, but
>> I am still not able to get a clean execution.
>>
>> I do not know if it is a matter of changing the parameters for the
>> windows execution in ci/pipelines/shared/jinja.variables.yml or if there is
>> anything else to consider.
>>
>> I would appreciate it if someone from the community could help me
>> troubleshoot this issue.
>>
>> Thanks in advance,
>>
>> Alberto
>>
>>
>>
