[ 
https://issues.apache.org/jira/browse/SUREFIRE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537253#comment-15537253
 ] 

James Taylor commented on SUREFIRE-1287:
----------------------------------------

Thanks for the info, [~tibor17]. It'd be great if you could add more logging. 
Let us know and we can start using a snapshot build to get to the bottom of it.

FWIW, we had this error occur (There was a timeout or other error in the fork) 
much more frequently before we added the {{Runtime.getRuntime().halt(0)}} call. 
It is called from a shutdown hook, so it runs when the JVM exits. We 
found that the HBase mini-cluster server on which we rely often hung when we 
let it shut down gracefully (and we really don't need any of the 
shutdown actions to be performed by HBase when we're running our tests).
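
For reference, the halt-from-a-shutdown-hook approach described above can be 
sketched roughly as follows. This is a minimal illustration, not our actual 
test code: the class name HaltOnExit and the install() method are hypothetical.

```java
// Sketch only: HaltOnExit and install() are hypothetical names.
final class HaltOnExit {
    private HaltOnExit() {}

    /**
     * Registers a shutdown hook that halts the JVM immediately with status 0
     * instead of waiting for a graceful shutdown. Returns the hook thread so
     * the caller can remove it again if needed.
     */
    static Thread install() {
        Thread hook = new Thread(
                () -> Runtime.getRuntime().halt(0), "halt-on-exit");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Calling HaltOnExit.install() once, early in the test setup, would be enough. 
Two caveats: shutdown hooks run concurrently in an unspecified order, so any 
other hook may be cut off part-way, and halt(0) forces the exit status to 0 
regardless of how the shutdown was initiated.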

A couple of brainstorming ideas:
- Our assumption is that this error is caused by a test that's hanging. Would 
it be possible for surefire to track tests that haven't completed yet and 
before exiting the forked JVM, print out a message about which tests haven't 
completed yet?
- If that's not possible, then how about a new method in RunListener that's 
called before the forked JVM is exited? Then we could track which tests haven't 
completed yet ourselves in our own RunListener. I suppose we could do the same 
if we keep some kind of static state in our RunListener, but that gets kind of 
ugly.
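
To make the second idea concrete, here is a rough sketch of the bookkeeping 
we have in mind. It is plain Java with no JUnit dependency; the class 
PendingTestTracker and its methods are hypothetical. The assumption is that a 
JUnit 4 RunListener would forward its testStarted/testFinished callbacks to 
it, for example passing Description.getDisplayName() as the name.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;

// Sketch only: PendingTestTracker is a hypothetical helper, not a Surefire
// or JUnit class.
final class PendingTestTracker {
    // Thread-safe and sorted, since tests may run in parallel.
    private final Set<String> pending = new ConcurrentSkipListSet<>();

    /** Call when a test starts. */
    void testStarted(String name) {
        pending.add(name);
    }

    /** Call when a test finishes, whether it passed or failed. */
    void testFinished(String name) {
        pending.remove(name);
    }

    /** Returns a sorted copy of the tests that started but never finished. */
    Set<String> snapshot() {
        return new TreeSet<>(pending);
    }

    /** Registers a shutdown hook that prints the unfinished tests. */
    Thread reportOnExit() {
        Thread hook = new Thread(() -> {
            for (String name : snapshot()) {
                System.err.println("Test did not complete: " + name);
            }
        }, "pending-test-report");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

This only helps if the forked JVM goes through a normal shutdown. If the 
process is killed forcibly, or another shutdown hook calls halt() first, the 
report may never be printed, which is why a callback from surefire itself 
would be more reliable.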

We're actually having to resort to a Python script that looks at the output 
logging to infer which tests haven't completed, which is kind of a brittle 
solution, though.

> Improve logging to understand why test run failed and report the right failed 
> category
> --------------------------------------------------------------------------------------
>
>                 Key: SUREFIRE-1287
>                 URL: https://issues.apache.org/jira/browse/SUREFIRE-1287
>             Project: Maven Surefire
>          Issue Type: Bug
>          Components: Maven Surefire Plugin
>    Affects Versions: 2.19.1
>            Reporter: Samarth Jain
>
> As part of our automated Jenkins builds that run after every checkin, we have 
> been seeing a lot of these failures:
> Failed to execute goal 
> org.apache.maven.plugins:maven-failsafe-plugin:2.19.1:verify 
> (ParallelStatsEnabledTest) on project phoenix-core: There was a timeout or 
> other error in the fork
> Sample run:
> https://builds.apache.org/job/Phoenix-master/1420/console
> Unfortunately that bit of error information doesn't really help. It would be 
> good to know why exactly the fork timed out or failed. What we do know is 
> that some of the tests in the JUnit category ParallelStatsDisabledTest failed 
> to complete. However, failsafe incorrectly reports the failed category as the 
> first category that ran. In this case it happened to be 
> ParallelStatsEnabledTest. Also worth noting is that failsafe kicks off the 
> next category run even before all the tests in the current category have 
> finished. I am not sure if that is by design or a bug. 
> FYI, [~jamestaylor].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
