https://bz.apache.org/bugzilla/show_bug.cgi?id=63859

--- Comment #3 from Aurelien Pernoud <aurel...@pernoud.org> ---
Hi Rainer,

Any info on usage characteristics during times this happens? High load (how
many requests per second), with or without load balancing, workers.properties
config etc.?

=> There is no "rule"; it even happens with only a few users on my test
instance.
Workers.properties is the same as for my Tomcat 7 & 8 instances, e.g. I use a
servertemplate which I apply to each node, and a lbtemplate too.

Here is an extract :
#Create one common template for all workers nodes
worker.servertemplate.type=ajp13
# factor of each worker is the same
worker.servertemplate.lbfactor=1
# ping_mode A is the most complete; currently set to C as a workaround
worker.servertemplate.ping_mode=C
# socket_timeout in seconds
worker.servertemplate.socket_timeout=30
# connection_pool_timeout in seconds
worker.servertemplate.connection_pool_timeout=600
# reply_timeout in milliseconds : 10 min
worker.servertemplate.reply_timeout=600000
# recovery_options : 3 : don't retry on error after request was sent
worker.servertemplate.recovery_options=3

# Create one common template for all LB nodes
worker.balancertemplate.type=lb
# If we have 10 replies timing out in the same minute => worker goes in error
# state
worker.balancertemplate.max_reply_timeouts=10

Then based on this I apply those settings to nodes / clusters.
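
As a minimal sketch of how such templates are usually applied, mod_jk's
`reference` attribute lets a worker inherit every setting from a template. The
worker names, hosts and ports below are hypothetical placeholders, not the
actual setup from this report.

```properties
# Hypothetical names, hosts and ports, for illustration only
worker.list=cluster1

# Each node inherits all servertemplate settings via "reference"
worker.node1.reference=worker.servertemplate
worker.node1.host=server1.example.org
worker.node1.port=8009

worker.node2.reference=worker.servertemplate
worker.node2.host=server2.example.org
worker.node2.port=8009

# The load balancer inherits balancertemplate and lists its members
worker.cluster1.reference=worker.balancertemplate
worker.cluster1.balance_workers=node1,node2
```

Attributes set explicitly on a worker take precedence over the inherited ones.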

How frequently does it happen (always, sporadically for NNN % of requests, in
spikes, ...)?

=> I couldn't find any "rule", sorry... but it happened at least 10 times a
day, and since I changed the cping/cpong mode to C it no longer shows up in the
logs, even though I have activity.

Can you easily reproduce?

=> Yes, I have a test environment which is in use, and if I switch the
cping/cpong mode back to A I'm sure the error will show up again.
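
A sketch of the one-line change that reproduction would involve, assuming the
rest of the servertemplate stays as in the extract above:

```properties
# Revert to the setting under which the error appeared
# A = all probes: connect (C), prepost (P) and interval (I)
worker.servertemplate.ping_mode=A
```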

Do you have root privileges, so could you sniff network traffic to the AJP
port? Are Apache and Tomcat on different machines? Any active components
(Firewalls, routers) in between?

=> I'm not root unfortunately, but my config is that I run 2 Linux servers,
which both host httpd and Tomcat instances (both servers run the same versions
of httpd and Tomcat), and it works fine with the AJP port of Tomcat 7 & 8.

There is no firewall between them (hardware or software), and the error occurs
even when httpd is connecting to "itself" (even though I don't use "localhost"
but the hostname).

I've run this setup for more than a year on Tomcat 7 & 8 (even with upgrades),
and only met the issue with 9. It failed with 9.0.20, so I tried upgrading last
week to 9.0.26, but the issue is still there.
Since yesterday, after switching the ping_mode to "C", it stopped failing, so
for now I will go with that in production, but it might be good to investigate.

Let me know if I can be of any help.

FYI I tried:
- putting mod_jk in debug => way too verbose (I run 9 instances of Tomcat in
the end on the same server), but at the beginning the behaviour looks correct
(I saw the ping/pong working with my Tomcat 9 instances every minute)
- adding debug in Tomcat "AJP": couldn't find exactly what I could put in
debug... if I put the full org.apache logger in DEBUG it's a nightmare :)
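
A narrower alternative to enabling DEBUG for all of org.apache might be to
raise only the AJP connector's logger in Tomcat's conf/logging.properties. This
is a sketch based on an assumption, not something tried in this report, and the
handler in use must also allow the FINE level.

```properties
# Assumption: the AJP connector classes log under org.apache.coyote.ajp
org.apache.coyote.ajp.level = FINE
```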

Let me know if I can help, and thanks for replying so quickly
