Thanks again for the information. We increased the RAM to 3g some time ago to prevent OOMEs. More recently, I increased it again to 5g, since we had memory to spare. The problem hasn't happened since, but it hasn't been very long.
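To make the timing quoted further down in this thread concrete: Tibor mentions a ping (NOOP) every 10 seconds and a check on every third one, so a pause longer than about 30 seconds could look like a dead Maven process to the forked JVM. The following is only an illustrative sketch of that arithmetic, not surefire's actual code. The class and method names (`PingWindow`, `windowMillis`, `pauseExceedsWindow`) are made up for this example.

```java
import java.time.Duration;

// Illustrative only: models the timing described in this thread,
// not the real surefire implementation. All names are hypothetical.
final class PingWindow {
    // From the thread: a ping (NOOP) is sent every 10 seconds.
    static final Duration PING_INTERVAL = Duration.ofSeconds(10);
    // From the thread: the check happens on every third ping.
    static final int PINGS_PER_CHECK = 3;

    // Length of the window in which at least one ping must be seen.
    static long windowMillis() {
        return PING_INTERVAL.multipliedBy(PINGS_PER_CHECK).toMillis();
    }

    // True if a stop-the-world pause of this length is longer than the window.
    static boolean pauseExceedsWindow(long pauseMillis) {
        return pauseMillis > windowMillis();
    }

    public static void main(String[] args) {
        System.out.println("window ms: " + windowMillis());
        System.out.println("5 s pause exceeds window: " + pauseExceedsWindow(5_000));
        System.out.println("45 s pause exceeds window: " + pauseExceedsWindow(45_000));
    }
}
```

If the GC logs ever show a single pause approaching that window, that would support the long-GC explanation. Pauses of a few seconds would point elsewhere.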
We use a more customized image based on Alpine 3.8.2. The JDK and Maven are obtained via apk. I will try upgrading failsafe (and surefire while I'm at it) soon, and will probably experiment with other JVMs another time (not pressing for me ATM).

On Tue, Feb 26, 2019 at 12:20 PM Tibor Digana wrote:
>
> >> I'll try to enable some logging about GC pauses to see what's up
>
> Please do not keep such a setting after tuning the GC, because it may
> sometimes break the interprocess communication between the Maven process
> and the surefire process.
> It's worth writing the GC information to a file rather than to the
> console logs. This can be configured, I guess.
>
> >> Do you think the value is simply too low?
>
> GCing many objects may take some time, and I remember we had a user who
> had this problem a year or two ago.
> As a fix, we check every third NOOP (which is 3 x 10 sec) instead of
> every NOOP. So 30 seconds looked satisfactory.
> I think you use an old version, 2.20 or something like that. The fixes
> for Docker have been made since then, so please use the latest version,
> 3.0.0-M3.
> See this page:
> https://maven.apache.org/surefire/maven-surefire-plugin/docker.html
> We used maven:3.5.3-jdk-8-alpine in this test. Which base image did you
> use?
>
> Cheers
> Tibor
>
> On Tue, Feb 26, 2019 at 5:24 PM Jason Young wrote:
>
> > Thanks for the information. It's good to see someone understands a
> > little about this.
> >
> > Incidentally, we have been looking at other GCs and VMs for the
> > application in production environments, so I'll look into how these
> > affect tests as well. I'll try to enable some logging about GC pauses
> > to see what's up.
> >
> > How would `-Xmx3g` cause long GC cycles? Do you think the value is
> > simply too low?
> >
> > FWIW we're running the Maven build in an Alpine-based Docker container.
> >
> > On Sat, Feb 23, 2019 at 6:36 AM Tibor Digana wrote:
> >
> > > Hi Jason,
> > >
> > > We spoke about this issue on our chat in ASF Slack:
> > > "I think his tests have been paused for long GC periods and timed
> > > out after 3x the PING period = 30 seconds. After this period, the
> > > forked JVM assumed the Maven process had been killed by Jenkins CI,
> > > and therefore all surefire processes were killed as well and all the
> > > file handles and memory were freed."
> > >
> > > "But I have to say that `-Xmx3g` may cause long GC cycles, see
> > > https://maven.apache.org/surefire/maven-surefire-plugin/examples/shutdown.html"
> > >
> > > You are using java-1.8-openjdk. I guess you should use Shenandoah
> > > GC, which is an experimental algorithm in JVM 1.8. This would
> > > significantly shorten the GC cycles.
> > >
> > > We should of course provide a new configuration parameter to give
> > > you a chance to prolong the PING.
> > >
> > > Cheers
> > > Tibor
> >
> > --
> > Jason Young

--
Jason Young
Software Engineer | PROCENTIVE
