On 02/21/11 10:21, Mark Thomas wrote:
> The ASF Sonar installation managed to generate 46GB of identical log
> messages [1] today in the 8 hours it took to notice it was down.

Continuing to drive down the cost of disk storage :-)

> While better monitoring would/should have identified the problem sooner,
> it does demonstrate a problem with the acceptor threads in all three
> endpoints. If there is a system-level issue that causes the accept()
> call to always fail (such as hitting the ulimit) then the endpoint
> essentially enters a loop where it logs an error message for every
> iteration of the loop. This will result in many log messages per second.
> 
> I'd like to do something about this. I was thinking of something along
> the lines of the following for each endpoint.
> 
> Index: java/org/apache/tomcat/util/net/JIoEndpoint.java
> ===================================================================
> --- java/org/apache/tomcat/util/net/JIoEndpoint.java  (revision 1072939)
> +++ java/org/apache/tomcat/util/net/JIoEndpoint.java  (working copy)
> @@ -183,9 +183,19 @@
>          @Override
>          public void run() {
> 
> +            int errorDelay = 0;
> +
>              // Loop until we receive a shutdown command
>              while (running) {
> 
> +                if (errorDelay > 0) {
> +                    try {
> +                        Thread.sleep(errorDelay);
> +                    } catch (InterruptedException e) {
> +                        // Ignore
> +                    }
> +                }
> +
>                  // Loop if endpoint is paused
>                  while (paused && running) {
>                      try {
> @@ -225,9 +235,15 @@
>                              // Ignore
>                          }
>                      }
> +                    errorDelay = 0;
>                  } catch (IOException x) {
>                      if (running) {
>                          log.error(sm.getString("endpoint.accept.fail"), x);
> +                        if (errorDelay == 0) {
> +                            errorDelay = 50;
> +                        } else if (errorDelay < 1600) {
> +                            errorDelay = errorDelay * 2;
> +                        }
>                      }
>                  } catch (NullPointerException npe) {
>                      if (running) {
> 
> 
> 
> Thoughts / comments?

+1 - a bit of smarts in reducing redundant logging is usually a good thing.
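
For illustration, the delay progression in the patch could be pulled out into a small helper so it can be checked on its own. This is only a minimal sketch: the class name AcceptBackoff and the method nextDelay are hypothetical and appear neither in the patch nor in Tomcat. The constants (50 ms initial delay, doubling while the delay is below 1600 ms) are taken from the diff above.

```java
// Hypothetical helper, not part of Tomcat: mirrors the errorDelay
// arithmetic from the proposed JIoEndpoint patch.
final class AcceptBackoff {

    static final int INITIAL_DELAY_MS = 50;   // first delay after a failure
    static final int MAX_DELAY_MS = 1600;     // doubling stops at this value

    private AcceptBackoff() {
        // Utility class, no instances
    }

    // Returns the delay to use after another failed accept(), given the
    // delay that was in effect before it. Pass 0 when the previous
    // accept() succeeded.
    static int nextDelay(int currentDelay) {
        if (currentDelay == 0) {
            return INITIAL_DELAY_MS;
        } else if (currentDelay < MAX_DELAY_MS) {
            return currentDelay * 2;
        }
        return currentDelay;
    }

    public static void main(String[] args) {
        int delay = 0;
        StringBuilder sb = new StringBuilder();
        // Simulate seven consecutive accept() failures
        for (int i = 0; i < 7; i++) {
            delay = nextDelay(delay);
            sb.append(delay).append(' ');
        }
        // prints "50 100 200 400 800 1600 1600"
        System.out.println(sb.toString().trim());
    }
}
```

With these numbers a persistently failing accept() settles at one log message every 1.6 seconds per acceptor thread, instead of one per loop iteration.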

