On 07.03.2009 09:00, mt...@apache.org wrote:
Author: mturk
Date: Sat Mar  7 08:00:54 2009
New Revision: 751217

URL: http://svn.apache.org/viewvc?rev=751217&view=rev
Log:
If the number of channels in error is more than half of the busy channels,
mark the worker global status as error

Modified:
     tomcat/connectors/trunk/jk/native/common/jk_ajp_common.c
     tomcat/connectors/trunk/jk/native/common/jk_lb_worker.c
     tomcat/connectors/trunk/jk/native/common/jk_shm.h

Modified: tomcat/connectors/trunk/jk/native/common/jk_ajp_common.c
URL: http://svn.apache.org/viewvc/tomcat/connectors/trunk/jk/native/common/jk_ajp_common.c?rev=751217&r1=751216&r2=751217&view=diff
==============================================================================
--- tomcat/connectors/trunk/jk/native/common/jk_ajp_common.c (original)
+++ tomcat/connectors/trunk/jk/native/common/jk_ajp_common.c Sat Mar  7 08:00:54 2009
@@ -2120,6 +2120,8 @@
      aw->s->transferred += e->wr;
      if (aw->s->busy)
          aw->s->busy--;
+    if (aw->s->in_error)
+        aw->s->in_error--;
      if (rc == JK_TRUE) {
          aw->s->state = JK_AJP_STATE_OK;
      }
@@ -2130,6 +2132,7 @@
      else {
          aw->s->state = JK_AJP_STATE_ERROR;
          aw->s->errors++;
+        aw->s->in_error++;
          aw->s->error_time = time(NULL);
      }
  }

I think this can't possibly work. We decrement once for each finished request (error or not), and we increment afterwards only if the request returned with an error.

So if

- the value is zero and we have an error, we end up with 1
- the value is one and we have no error, we end up with zero
- the value is one and we have another error, we end up again with 1-1+1=1 (not 2).

Therefore in_error will never exceed the value 1 here.
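
To see it concretely, here is a tiny standalone simulation of the committed accounting (only the name in_error is taken from the patch, everything else is made up for illustration):

    #include <stdio.h>

    static int in_error = 0;

    /* Mirrors the patch: the counter is decremented for every finished
     * request and incremented again only when the request failed. */
    static void request_done(int had_error)
    {
        if (in_error)
            in_error--;
        if (had_error)
            in_error++;
    }

    int main(void)
    {
        request_done(1);                      /* error:   0 -> 1      */
        request_done(1);                      /* error:   1 -> 0 -> 1 */
        request_done(0);                      /* success: 1 -> 0      */
        request_done(1);                      /* error:   0 -> 1      */
        printf("in_error = %d\n", in_error);  /* prints 1, never 2    */
        return 0;
    }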

Plan B: Counting error excess

We could change it to decrement only in the cases JK_TRUE or JK_CLIENT_ERROR, but then we would count the excess of errors over OK requests. So whenever more OKs follow than errors, the counter would go back to 0. The problem with this counter is that I cannot see any good criterion for deciding about the global node state. The excess could only be related to the load, e.g. to how many requests per unit of time we were handling recently, and that's another number we don't have.
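
For illustration, a rough sketch of that variant as a helper (the function name is made up; it assumes the ajp_worker_t declarations from jk_ajp_common.h and the return codes visible in the diff, so don't read it as a proposed patch):

    /* Plan B: only JK_TRUE / JK_CLIENT_ERROR pay an error back, so
     * in_error counts the excess of errors over OK requests and drops
     * back to 0 as soon as enough requests succeed again. */
    static void update_in_error(ajp_worker_t *aw, int rc)
    {
        if (rc == JK_TRUE || rc == JK_CLIENT_ERROR) {
            if (aw->s->in_error)
                aw->s->in_error--;
        }
        else {
            aw->s->in_error++;
        }
    }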

Something along these lines:

- add a special request counter x
- handle in_errors as described in B)
- reset x to zero whenever in_errors is zero
- increment x by one for each request as long as in_errors is positive
- in the lb "else" branch, choose global error if in_error is bigger than N% of x (e.g. N=10 or N=50). But we still need some correction for small counts, like in_errors=x=1 (a rough sketch follows below).
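
Roughly like this (x is kept in a hypothetical new shared memory field req_since_error, and N is hard coded to 50 just as an example; none of this exists today):

    /* bookkeeping after each finished request on this worker */
    if (aw->s->in_error == 0)
        aw->s->req_since_error = 0;      /* the counter "x" */
    else
        aw->s->req_since_error++;

    /* in the lb "else" branch, with e.g. N = 50 (percent);
     * in_error > 1 skips the degenerate in_errors = x = 1 case */
    if (aw->s->in_error > 1 &&
        aw->s->in_error * 100 > 50 * aw->s->req_since_error) {
        /* more than N% of the requests counted by x ended in error,
           so treat the whole node as being in error */
    }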

Plan C: Counting error endpoints (approximation of busy errors)

Each endpoint could remember whether it last had an error or not. Then after the request it would

- increment in_errors, if it went from OK to error
- decrement in_errors, if it went from error to OK
- keep in_errors the same otherwise
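
As a sketch (ep->last_error would be a new per-endpoint flag that does not exist today, and deriving had_error from rc like below is just one possible choice):

    /* at the end of a request served by endpoint ep */
    int had_error = (rc != JK_TRUE && rc != JK_CLIENT_ERROR);

    if (had_error && !ep->last_error)
        aw->s->in_error++;               /* endpoint went OK -> error */
    else if (!had_error && ep->last_error && aw->s->in_error)
        aw->s->in_error--;               /* endpoint recovered        */
    /* otherwise in_error stays the same */

    ep->last_error = had_error;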

But still, since there is no fair usage distribution over the endpoints, this will not give a useful number (lots of OK requests could use the same endpoint and all the error requests could be distributed over many different endpoints or vice versa).

The problem comes from the fact that busy is a snapshot number, and there is no way to tell how many of the requests currently in flight will return with an error.

Summary:

I still like the idea of using the error_time. Each OK request will reset it, and that's fine. As long as something good is coming back, the node still gets a chance globally. But if there are no OKs for some time, we should switch to global ERROR.
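
In code that could roughly look like the following, assuming we change the semantics so that error_time is only set on the first error after the last OK, i.e. it marks when the trouble started (RECOVER_WAIT is a made-up name for the 10 or 60 second threshold, and the exact spot in jk_lb_worker.c is left open):

    /* on every OK request: clear the mark */
    aw->s->error_time = 0;

    /* on an error: remember when the trouble started, but do not
     * refresh it while the trouble persists */
    if (aw->s->error_time == 0)
        aw->s->error_time = time(NULL);

    /* in the lb decision: nothing good came back for RECOVER_WAIT seconds */
    if (aw->s->error_time > 0 &&
        (long)(time(NULL) - aw->s->error_time) >= RECOVER_WAIT) {
        /* switch the node to global ERROR */
    }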

After 10 seconds or after 60 seconds: I think 60 seconds is pretty long, but I would accept it as a compromise :)

Regards,

Rainer
