https://bz.apache.org/bugzilla/show_bug.cgi?id=63859
--- Comment #3 from Aurelien Pernoud <aurel...@pernoud.org> ---
Hi Rainer,

Any info on usage characteristics during times this happens? High load (how many requests per second), with or without load balancing, workers.properties config etc.?
=> There is no "rule"; it even happens with only a few users on my test instance. workers.properties is the same as for my Tomcat 7 & 8 instances, i.e. I use a servertemplate which I apply to each node, and an lbtemplate too. Here is an extract:

#Create one common template for all workers nodes
worker.servertemplate.type=ajp13
# factor of each worker is the same
worker.servertemplate.lbfactor=1
# ping_mode A is the most complete
worker.servertemplate.ping_mode=C
# socket_timeout in seconds
worker.servertemplate.socket_timeout=30
# connection_pool_timeout in seconds
worker.servertemplate.connection_pool_timeout=600
# reply_timeout in milliseconds : 10 min
worker.servertemplate.reply_timeout=600000
# recovery_options : 3 : don't retry on error after request was sent
worker.servertemplate.recovery_options=3

# Create one common template for all LB nodes
worker.balancertemplate.type=lb
# If we have 10 replies timing out in the same minute => worker goes in error state
worker.balancertemplate.max_reply_timeouts=10

Then, based on this, I apply those settings to nodes / clusters.

How frequently does it happen (always, sporadically for NNN % of requests, in spikes, ...)?
=> I couldn't find any "rule", sorry... but it happened at least 10 times a day, and since I changed the cping/cpong mode to C it no longer shows up in the logs, even though I have activity.

Can you easily reproduce?
=> Yes, I have a test environment which is in use, and if I switch the cping/cpong mode back to A I'm sure the error will show up again.

Do you have root privileges, so could you sniff network traffic to the AJP port? Are Apache and Tomcat on different machines? Any active components (firewalls, routers) in between?
=> I'm not root, unfortunately. My setup is two Linux servers that both host httpd and Tomcat instances (both servers run the same versions of httpd and Tomcat), and it works fine with the AJP port of Tomcat 7 & 8. There is no firewall between them (hardware or software), and the error occurs even when httpd is connecting to itself (even though I don't use "localhost" but the hostname).

I've run this setup for more than a year on Tomcat 7 & 8 (even across upgrades), and only hit the issue with 9. It fails with 9.0.20, so I tried upgrading last week to 9.0.26, but the issue is still there. Since yesterday, when I switched ping_mode to "C", it has stopped failing, so for now I will go like that in production, but it might be good to investigate. Let me know if I can be of any help.

FYI I tried:
- putting mod_jk in debug => way too verbose (I run 9 instances of Tomcat in the end on the same server), but at the beginning the behaviour looks correct (I saw the ping/pong working with my Tomcat 9 instances every minute)
- adding debug in Tomcat "AJP": I couldn't find exactly what I could put in debug... if I put the full org.apache logger in DEBUG it's a nightmare :)

Let me know if I can help, and thanks for replying so quickly
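
A minimal sketch of how templates like the ones in the extract can be applied to a node and a balancer, assuming mod_jk's `reference` attribute is the mechanism in use. The names `node1` and `balancer1`, the host, and the port are hypothetical placeholders, not values from the setup described above:

```properties
# Hypothetical node that inherits every setting from the server template
worker.node1.reference=worker.servertemplate
worker.node1.host=tomcat-host.example.org
worker.node1.port=8009

# Hypothetical balancer that inherits from the balancer template
worker.balancer1.reference=worker.balancertemplate
worker.balancer1.balance_workers=node1

# Only the workers listed here are exposed to httpd; the templates are left out
worker.list=balancer1

# Per-node override while investigating:
# C = probe on connect only, A = all probes (connect, prepost, interval)
worker.node1.ping_mode=C
```

Overriding `ping_mode` on a single node rather than in the template should make it possible to compare modes A and C side by side on the test instance without touching the other workers.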