P.S. I'd also like to move the messages about attempts to recover workers from errors to a lower (and by default unlogged) logging level. The transition of a worker into an error state should certainly be logged, but logging every time we find it still in an error state seems excessive -- at least for a sparsely populated port bank use case.
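To make the logging idea concrete, here is a minimal sketch of the rule I have in mind. The `log_level` enum and `error_state_log_level()` are made-up names for illustration only, not anything that exists in mod_jk today.

```c
#include <stdbool.h>

/* Hypothetical log levels for this sketch; not mod_jk's real constants. */
typedef enum { LVL_NONE, LVL_DEBUG, LVL_ERROR } log_level;

/*
 * Pick the level for a message about a worker's error status.
 *   was_in_error: state seen on the previous check
 *   is_in_error:  state seen on this check
 */
log_level error_state_log_level(bool was_in_error, bool is_in_error)
{
    if (!is_in_error)
        return LVL_NONE;   /* not in error: this sketch reports nothing */
    if (!was_in_error)
        return LVL_ERROR;  /* transition into error: always worth logging */
    return LVL_DEBUG;      /* still in error: quiet unless debugging */
}
```

With something like this, a port bank with many unused ports would produce one error line per dead worker rather than one per recovery check.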
Jess Holle wrote:
Jess Holle wrote:
Jess Holle wrote:
Mladen Turk wrote:
Jess Holle wrote:
Mladen Turk wrote:
Is there a means of achieving background-only (or nearly so) 
testing of dead workers with mod_jk?  That's what I'm looking for 
in both jk and mod_proxy_ajp connectors.  I guess I was 
hoping/assuming it was there in mod_jk from reading the docs.
There is in the mod_jk (SVN trunk).
I've been reading this code now...

The watchdog thread looks very useful. If I understand it correctly, the watchdog thread can do whatever it feels like but currently mainly calls wc_maintain, which will only do work at most every worker.maintain seconds, right?
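If my reading is right, the throttling amounts to something like the sketch below. This is only my paraphrase of the behaviour, not the actual code: `maintain_state` and `maintain_if_due()` are hypothetical names, and `interval` stands in for the worker.maintain setting.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical bookkeeping for a throttled maintenance call. */
typedef struct {
    time_t last_run;  /* when maintenance last did real work */
    int interval;     /* seconds; plays the role of worker.maintain */
    int runs;         /* how many times real work was done */
} maintain_state;

/*
 * May be called as often as the watchdog likes; only does work
 * once at least `interval` seconds have passed since the last run.
 * Returns true if maintenance actually ran.
 */
bool maintain_if_due(maintain_state *s, time_t now)
{
    if (difftime(now, s->last_run) < s->interval)
        return false;  /* too soon: skip this wakeup */
    s->last_run = now;
    s->runs++;         /* real maintenance work would go here */
    return true;
}
```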
connection_keepalive does not really look like it fits my bill, though.  
I'm most worried about workers in an error state and ensuring that 
they are rechecked every recover_wait_time -- but only by the 
watchdog thread and ideally via a ping/pong.  Currently 
recover_workers appears to just put workers into a recovery state 
where they'll be eligible to be tried again on a future request -- 
without checking whether the worker is actually accessible.  That's 
fine for some use cases, but explicitly what I want to avoid.
Are there any thoughts on adding an option to have recover_workers() do 
a ping prior to returning a worker to a non-error state?
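Roughly what I am picturing, as a sketch rather than a patch: the types, the `ping` callback, and `recover_workers_with_ping()` below are all hypothetical and do not mirror mod_jk's real structures. The two probe functions are stubs standing in for a real ping/pong exchange.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical worker model; not mod_jk's actual definitions. */
typedef enum { WORKER_OK, WORKER_ERROR } worker_state;

typedef struct worker {
    worker_state state;
    time_t error_time;                    /* start of the current wait period */
    bool (*ping)(const struct worker *w); /* stand-in for a ping/pong probe */
} worker;

/* Stub probes for illustration: one backend answers, one does not. */
bool probe_up(const worker *w)   { (void)w; return true; }
bool probe_down(const worker *w) { (void)w; return false; }

/*
 * Intended to run only on the watchdog thread. A worker in error is
 * pinged once its wait time has elapsed and leaves the error state
 * only if the ping succeeds. Returns the number of workers recovered.
 */
int recover_workers_with_ping(worker *workers, int count,
                              time_t now, int recover_wait_time)
{
    int recovered = 0;
    for (int i = 0; i < count; i++) {
        worker *w = &workers[i];
        if (w->state != WORKER_ERROR)
            continue;                 /* nothing to recover */
        if (difftime(now, w->error_time) < recover_wait_time)
            continue;                 /* not due for a recheck yet */
        if (w->ping != NULL && w->ping(w)) {
            w->state = WORKER_OK;     /* verified reachable */
            recovered++;
        } else {
            w->error_time = now;      /* still dead: restart the wait */
        }
    }
    return recovered;
}
```

The point is that no request thread ever pays for probing a dead worker; requests only see workers that have already answered a ping.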
And, yes, a watchdog thread in mod_proxy_balancer /and/ a reasonable 
means of balancer invoking a ping via mod_proxy_ajp would be really 
helpful as far as mod_proxy_ajp is concerned.
Another, possibly simpler, alternative: we could introduce a limit on how many workers we attempt (unforced) recoveries on for any given request. Any request could likely tolerate a recovery attempt or two. None should have to tolerate 6-12 recovery attempts just because of a currently sparsely populated port range.
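A rough sketch of that per-request cap, with invented names throughout (`lb_member`, `pick_member()`, `max_recoveries`); the `reachable` flag simply stands in for the outcome of a real connection attempt.

```c
#include <stdbool.h>

/* Hypothetical balancer member; not mod_jk's real structure. */
typedef struct {
    bool in_error;   /* currently marked as failed */
    bool reachable;  /* stand-in for the result of a real connect attempt */
} lb_member;

/*
 * Walk the members in balancer order for a single request. Healthy
 * members are used directly. Members in error are tried only while
 * the per-request recovery budget lasts; after that they are skipped.
 * Returns the index of the chosen member, or -1 if none was usable.
 * *recoveries_tried reports how many recovery attempts were made.
 */
int pick_member(lb_member *members, int count,
                int max_recoveries, int *recoveries_tried)
{
    int tried = 0;
    int chosen = -1;
    for (int i = 0; i < count && chosen < 0; i++) {
        lb_member *m = &members[i];
        if (!m->in_error) {
            chosen = i;               /* healthy: use it */
        } else if (tried < max_recoveries) {
            tried++;                  /* spend one recovery attempt */
            if (m->reachable) {
                m->in_error = false;  /* recovery worked */
                chosen = i;
            }
        }
        /* otherwise: budget exhausted, skip this dead member */
    }
    *recoveries_tried = tried;
    return chosen;
}
```

With a budget of 2, a request facing a long run of dead ports gives up on recoveries after two failures and moves on to the first healthy member.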
Thoughts?

--
Jess Holle
