Geode unit tests 'develop/DistributedTest' took too long to execute
Pipeline results can be found at: Concourse: https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/36
Geode unit tests completed in 'develop/UITests' with non-zero exit code
Pipeline results can be found at: Concourse: https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/UITests/builds/53
PR #1868: new ThreadMonitoring feature
Can a couple more folks please review PR #1868? My review feedback pertained only to using dependency injection instead of a singleton or static getter and Yossi has updated the code as I requested. My review doesn't really address the usefulness of the feature or the impact on features that were updated to use ThreadMonitoring. Note that there is new User API and Configuration. It would be good for people who work with Queues, Pools, Configuration, FunctionExecutor, DistributionManager (and its Executors), Client/Server communication, WAN Gateways to review this PR since it touches all of these classes: AbstractGatewaySenderEventProcessor.java AcceptorImpl.java ClusterDistributionManager.java ConcurrentParallelGatewaySenderEventProcessor.java ConcurrentSerialGatewaySenderEventProcessor.java DistributionConfig.java DistributionConfigImpl.java DistributionManager.java ExpiryTask.java FunctionExecutionPooledExecutor.java GemFireCacheImpl.java LonerDistributionManager.java OneTaskOnlyExecutor.java ParallelAsyncEventQueueImpl.java ParallelGatewaySenderEventProcessor.java ParallelGatewaySenderHelper.java ParallelGatewaySenderImpl.java PoolImpl.java PRHARedundancyProvider.java RemoteConcurrentParallelGatewaySenderEventProcessor.java RemoteConcurrentSerialGatewaySenderEventProcessor.java RemoteParallelGatewaySenderEventProcessor.java RemoteSerialGatewaySenderEventProcessor.java ScheduledThreadPoolExecutorWithKeepAlive.java SerialAsyncEventQueueImpl.java SerialGatewaySenderEventProcessor.java SerialGatewaySenderImpl.java SerialQueuedExecutorWithDMStats.java TcpServer.java This PR also introduces new user visible API and configuration options: ConfigurationProperties.java ThreadMonitoring.java (I may have left out a couple classes) https://github.com/apache/geode/pull/1868
Geode unit tests completed in 'develop/FlakyTest' with non-zero exit code
Pipeline results can be found at: Concourse: https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/FlakyTest/builds/53
Re: PR #1868: new ThreadMonitoring feature
This seems like a really bad idea. It's using the long-deprecated Thread.stop() to kill executor threads that take too long to execute. It has no idea what state the thread is in when it kills it. It could be doing disk I/O or writing to shared memory. On 6/4/18 9:50 AM, Kirk Lund wrote: Can a couple more folks please review PR #1868? My review feedback pertained only to using dependency injection instead of a singleton or static getter and Yossi has updated the code as I requested. My review doesn't really address the usefulness of the feature or the impact on features that were updated to use ThreadMonitoring. Note that there is new User API and Configuration. It would be good for people who work with Queues, Pools, Configuration, FunctionExecutor, DistributionManager (and its Executors), Client/Server communication, WAN Gateways to review this PR since it touches all of these classes: AbstractGatewaySenderEventProcessor.java AcceptorImpl.java ClusterDistributionManager.java ConcurrentParallelGatewaySenderEventProcessor.java ConcurrentSerialGatewaySenderEventProcessor.java DistributionConfig.java DistributionConfigImpl.java DistributionManager.java ExpiryTask.java FunctionExecutionPooledExecutor.java GemFireCacheImpl.java LonerDistributionManager.java OneTaskOnlyExecutor.java ParallelAsyncEventQueueImpl.java ParallelGatewaySenderEventProcessor.java ParallelGatewaySenderHelper.java ParallelGatewaySenderImpl.java PoolImpl.java PRHARedundancyProvider.java RemoteConcurrentParallelGatewaySenderEventProcessor.java RemoteConcurrentSerialGatewaySenderEventProcessor.java RemoteParallelGatewaySenderEventProcessor.java RemoteSerialGatewaySenderEventProcessor.java ScheduledThreadPoolExecutorWithKeepAlive.java SerialAsyncEventQueueImpl.java SerialGatewaySenderEventProcessor.java SerialGatewaySenderImpl.java SerialQueuedExecutorWithDMStats.java TcpServer.java This PR also introduces new user visible API and configuration options: ConfigurationProperties.java ThreadMonitoring.java (I may have left out a couple classes) https://github.com/apache/geode/pull/1868
Re: PR #1868: new ThreadMonitoring feature
This looks like a dangerous change that is putting a fixed time limit on execution of arbitrary jobs with no real check to see if threads are stuck or not. If a thread is doing some heavy lifting it could be killed with Thread.stop() in the middle of its work. Think initial image transfer, for instance. This can leave "sender" threads stuck waiting for replies until they, too, are killed. No-one is even supposed to /use/ Thread.stop(). On 6/4/18 9:50 AM, Kirk Lund wrote: Can a couple more folks please review PR #1868? My review feedback pertained only to using dependency injection instead of a singleton or static getter and Yossi has updated the code as I requested. My review doesn't really address the usefulness of the feature or the impact on features that were updated to use ThreadMonitoring. Note that there is new User API and Configuration. It would be good for people who work with Queues, Pools, Configuration, FunctionExecutor, DistributionManager (and its Executors), Client/Server communication, WAN Gateways to review this PR since it touches all of these classes: AbstractGatewaySenderEventProcessor.java AcceptorImpl.java ClusterDistributionManager.java ConcurrentParallelGatewaySenderEventProcessor.java ConcurrentSerialGatewaySenderEventProcessor.java DistributionConfig.java DistributionConfigImpl.java DistributionManager.java ExpiryTask.java FunctionExecutionPooledExecutor.java GemFireCacheImpl.java LonerDistributionManager.java OneTaskOnlyExecutor.java ParallelAsyncEventQueueImpl.java ParallelGatewaySenderEventProcessor.java ParallelGatewaySenderHelper.java ParallelGatewaySenderImpl.java PoolImpl.java PRHARedundancyProvider.java RemoteConcurrentParallelGatewaySenderEventProcessor.java RemoteConcurrentSerialGatewaySenderEventProcessor.java RemoteParallelGatewaySenderEventProcessor.java RemoteSerialGatewaySenderEventProcessor.java ScheduledThreadPoolExecutorWithKeepAlive.java SerialAsyncEventQueueImpl.java SerialGatewaySenderEventProcessor.java SerialGatewaySenderImpl.java SerialQueuedExecutorWithDMStats.java TcpServer.java This PR also introduces new user visible API and configuration options: ConfigurationProperties.java ThreadMonitoring.java (I may have left out a couple classes) https://github.com/apache/geode/pull/1868
[Spring CI] Spring Data GemFire > Nightly-ApacheGeode > #938 was SUCCESSFUL (with 2406 tests)
--- Spring Data GemFire > Nightly-ApacheGeode > #938 was successful. --- Scheduled 2408 tests in total. https://build.spring.io/browse/SGF-NAG-938/ -- This message is automatically generated by Atlassian Bamboo
Re: PR #1868: new ThreadMonitoring feature
Some great points Bruce!! I believe this whole thread Monitoring feature needs to have the scope expanded... I mean.. We would either require compensatory events to rollback ANYTHING that the thread had been doing until the point it was killed. When it was proposed, I understood it as a mechanism that would let us know if there are threads that are stuck, in real time Seems this PR has some remediation build it, which would require A LOT more thought!!! --Udo On 6/4/18 13:55, Bruce Schuchardt wrote: This looks like a dangerous change that is putting a fixed time limit on execution of arbitrary jobs with no real check to see if threads are stuck or not. If a thread is doing some heavy lifting it could be killed with Thread.stop() in the middle of its work. Think initial image transfer, for instance. This can leave "sender" threads stuck waiting for replies until they, too, are killed. No-one is even supposed to /use/ Thread.stop(). On 6/4/18 9:50 AM, Kirk Lund wrote: Can a couple more folks please review PR #1868? My review feedback pertained only to using dependency injection instead of a singleton or static getter and Yossi has updated the code as I requested. My review doesn't really address the usefulness of the feature or the impact on features that were updated to use ThreadMonitoring. Note that there is new User API and Configuration. It would be good for people who work with Queues, Pools, Configuration, FunctionExecutor, DistributionManager (and its Executors), Client/Server communication, WAN Gateways to review this PR since it touches all of these classes: AbstractGatewaySenderEventProcessor.java AcceptorImpl.java ClusterDistributionManager.java ConcurrentParallelGatewaySenderEventProcessor.java ConcurrentSerialGatewaySenderEventProcessor.java DistributionConfig.java DistributionConfigImpl.java DistributionManager.java ExpiryTask.java FunctionExecutionPooledExecutor.java GemFireCacheImpl.java LonerDistributionManager.java OneTaskOnlyExecutor.java ParallelAsyncEventQueueImpl.java ParallelGatewaySenderEventProcessor.java ParallelGatewaySenderHelper.java ParallelGatewaySenderImpl.java PoolImpl.java PRHARedundancyProvider.java RemoteConcurrentParallelGatewaySenderEventProcessor.java RemoteConcurrentSerialGatewaySenderEventProcessor.java RemoteParallelGatewaySenderEventProcessor.java RemoteSerialGatewaySenderEventProcessor.java ScheduledThreadPoolExecutorWithKeepAlive.java SerialAsyncEventQueueImpl.java SerialGatewaySenderEventProcessor.java SerialGatewaySenderImpl.java SerialQueuedExecutorWithDMStats.java TcpServer.java This PR also introduces new user visible API and configuration options: ConfigurationProperties.java ThreadMonitoring.java (I may have left out a couple classes) https://github.com/apache/geode/pull/1868
Geode unit tests completed in 'develop/UITests' with non-zero exit code
Pipeline results can be found at: Concourse: https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/UITests/builds/58
Geode unit tests completed in 'develop/DistributedTest' with non-zero exit code
Pipeline results can be found at: Concourse: https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/39