Good idea; we need better tools (ecosystems) to manage and monitor Geode resources. In Geode, work is often handed off to other low-level threads (messaging) or to new threads/runnables. It would be nice to have some mechanism to associate the main work thread with the low-level thread; that would give a better indication of who is waiting on whom.
-Anil

On Fri, Feb 23, 2018 at 3:08 PM, Barry Oglesby <bogle...@pivotal.io> wrote:

> A lot of the Geode thread pools are defined in ClusterDistributionManager.
> Most of these use custom ThreadPoolExecutors like:
>
> SerialQueuedExecutorWithDMStats
> PooledExecutorWithDMStats
> FunctionExecutionPooledExecutor
>
> These classes all extend ThreadPoolExecutor and override beforeExecute and
> afterExecute. These methods are currently used by helper classes to update
> the stats before and after a thread executes. Potentially these same
> methods could be used to add and remove a thread from a monitor. For
> example, there could be a FunctionExecutionThreadMonitor that is created as
> part of the FunctionExecutionPooledExecutor whose job it would be to
> monitor FunctionExecution threads. The beforeExecute method would add the
> thread to the monitor; the afterExecute method would remove the thread from
> the monitor.
>
> I would be mindful about the performance impact of adding these monitors,
> though.
>
> Thanks,
> Barry Oglesby
>
> On Wed, Feb 21, 2018 at 11:41 AM, Gregory Vortman <gregory.vort...@amdocs.com> wrote:
>
> > That's exactly the point: to have a single, very thin and generic mechanism
> > to cover all threads/thread pools. Nothing in this solution is specific.
> > Regards
> >
> > -----Original Message-----
> > From: Jason Huynh [jhu...@pivotal.io]
> > Received: Wednesday, 21 Feb 2018, 20:54
> > To: dev@geode.apache.org [dev@geode.apache.org]
> > CC: u...@geode.apache.org [u...@geode.apache.org]
> > Subject: Re: [Proposal] Thread monitoring mechanism
> >
> > I am assuming this would be for all thread/thread pools and not specific
> > to Function threads. I wonder what the impact would be for put/get
> > operations, or are we going to target specific operations?
> >
> > On Tue, Feb 20, 2018 at 1:04 AM Gregory Vortman <gregory.vort...@amdocs.com> wrote:
> >
> > Hello team,
> > One of the most severe issues hitting our real-time application is threads
> > getting stuck for multiple reasons, such as long-lasting locks, deadlocks,
> > threads that wait forever for a reply after a packet drop, etc.
> > Stuck threads of this kind go undetected by the existing system health
> > check methods.
> > In mission-critical applications, this results in an immediate outage.
> >
> > As a short-term measure, we are implementing a kind of internal watchdog
> > mechanism for stuck-thread detection:
> > - There is a registration object.
> > - The function executor has start/end hooks to register/unregister the
> >   thread via the registration object.
> > - A customized monitoring scheduled thread is spawned on startup. The
> >   thread wakes up every N seconds, scans the registration map, and detects
> >   threads that have not been unregistered for a long time (configurable).
> > - Once such threads have been detected, the process stack is taken and a
> >   thread stack statistic metric is provided.
> >
> > This helps us to monitor, detect, and quickly decide which action should
> > be taken. Usually the decision is a member bounce (a consistency issue is
> > possible, but in our case that is better than denial of service).
> > The above solution does not touch the GEODE core code; it is implemented
> > within the boundaries of customized code only.
> >
> > I would like to raise a proposal to introduce a long-term generic thread
> > monitoring mechanism, to detect threads which are stuck for any reason:
> > to maintain a monitoring object having start/end methods to be invoked
> > similarly to FunctionStats.startFunctionExecution and
> > FunctionStats.endFunctionExecution.
> >
> > Your feedback would be appreciated.
> >
> > Thank you for your cooperation.
> > Best regards!
> >
> > Gregory Vortman
> >
> > This message and the information contained herein is proprietary and
> > confidential and subject to the Amdocs policy statement,
> > you may review at https://www.amdocs.com/about/email-disclaimer
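
To make the idea discussed in this thread concrete, here is a minimal Java sketch that combines the beforeExecute/afterExecute hooks Barry mentions with the registration map and periodic scan from Gregory's proposal. ThreadMonitor, MonitoredExecutor, and startWatchdog are hypothetical names invented for this illustration; they are not Geode classes, and the sketch makes no attempt to measure the performance impact raised above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical registration object (not a Geode class): maps each busy thread to its start time. */
class ThreadMonitor {
  private final Map<Thread, Long> startNanos = new ConcurrentHashMap<>();
  private final long thresholdNanos;

  ThreadMonitor(long thresholdMillis) {
    this.thresholdNanos = TimeUnit.MILLISECONDS.toNanos(thresholdMillis);
  }

  void register(Thread t) {
    startNanos.put(t, System.nanoTime());
  }

  void unregister(Thread t) {
    startNanos.remove(t);
  }

  /** Returns the threads that have stayed registered for at least the configured threshold. */
  List<Thread> findStuckThreads() {
    long now = System.nanoTime();
    List<Thread> stuck = new ArrayList<>();
    for (Map.Entry<Thread, Long> entry : startNanos.entrySet()) {
      if (now - entry.getValue() >= thresholdNanos) {
        stuck.add(entry.getKey());
      }
    }
    return stuck;
  }

  /** Starts a daemon thread that scans every periodMillis and prints the stacks of stuck threads. */
  ScheduledExecutorService startWatchdog(long periodMillis) {
    ScheduledExecutorService scanner = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "thread-monitor-watchdog");
      t.setDaemon(true);
      return t;
    });
    scanner.scheduleAtFixedRate(() -> {
      for (Thread t : findStuckThreads()) {
        System.err.println("Possibly stuck thread: " + t.getName());
        for (StackTraceElement frame : t.getStackTrace()) {
          System.err.println("    at " + frame);
        }
      }
    }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    return scanner;
  }
}

/** Hypothetical executor that registers a worker before each task and unregisters it afterwards. */
class MonitoredExecutor extends ThreadPoolExecutor {
  private final ThreadMonitor monitor;

  MonitoredExecutor(int threads, ThreadMonitor monitor) {
    super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    this.monitor = monitor;
  }

  @Override
  protected void beforeExecute(Thread t, Runnable r) {
    monitor.register(t);
    super.beforeExecute(t, r);
  }

  @Override
  protected void afterExecute(Runnable r, Throwable t) {
    super.afterExecute(r, t);
    // afterExecute runs on the worker thread that executed the task.
    monitor.unregister(Thread.currentThread());
  }
}
```

A caller would create one ThreadMonitor with the desired threshold, pass it to each MonitoredExecutor, and call startWatchdog once at startup. The sketch only reports suspects; what to do with them (for example, bouncing the member) is left to the operator, as in the original proposal.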