Hi!,
No... nothing to do with that mate...
My problem was related to non spawning new processes when needed for
handling mainly some little peaks of load...
I have changed the script to behave as Bron recommended with cyr_info
and will check it... there's a big difference between cyr_info proc
output pids for killing sometimes and the procceses that in the kill
moment are in READY state and unused for 10 minutes or whatever time and
with the old process title...
So let's check it...
Cheers!
El 2024-10-02 11:34, Дилян Палаузов escribió:
Hello,
Sometime ago, I wrote that when we send TERM to imapd (although it
would happen with any other I assume, pop, sieve...) procceses we
wanted to exit,
When and where have you written this?
I observe that sometimes stopping master, by sending it SIGTERM,
triggers an endless loop there waiting for some child to terminate -
https://cyrus.topicbox.com/groups/devel/Tea1dbf8f58b01177-Mec77eb0e4f3ec3641e7e0191/the-master-janitor-goes-crazy-re-debugging-deadlocks
. This leads to 100% CPU utilization because of that loop. It happens
repeatedly, but I do not know how to reproduce this.
I thought https://github.com/cyrusimap/cyrus-imapd/pull/3449 is a fix,
but it is not.
Does your proposal approach the problem that on SIGTERM the master
process enters endless loop?
Greetings
Дилян
-----Original Message-----
From: egoitz via Devel <devel@cyrus.topicbox.com>
Reply-To: Devel <devel@cyrus.topicbox.com>
To: devel@cyrus.topicbox.com
Subject: Services in READY state and MAX_READY_FAILS in less than
MAX_READY_FAIL_INTERVAL
Date: 10/09/24 19:55:04
Hi!,
Sometime ago, I wrote that when we send TERM to imapd (although it
would happen with any other I assume, pop, sieve...) procceses we
wanted to exit, due to a user request for disconnecting his/her
sessions, sometimes happened that was like, after that sessions
disconnection (TERM to imapd processes) no enough processes where
become spawned newly. Only sometimes when very few processes needed to
be killed.
I have been able to reproduce it. If a user has connected (because
proctitle() has set it in the name) and later in very few time "leaves"
(logouts for instance) and then the process moves to READY state if you
kill with TERM more than MAX_READY_FAILS units of that process in less
than MAX_READY_FAIL_INTERVAL, master won't spawn new processes as it's
written in master.c in lines near 1100 in reap_child() function.
It's suggested to launch a SIGHUP to master for activating again the
service, but it can't be enabled again because the service seems to
have removed from the s data structure but not stopped. Due to that non
process stop, when new imapd attemps to load (in service_create() ) it
can't be created because the socket is still in use.
So, for ensuring this is correct, I have written the following patch
for master.c and that I tested on 3.0.15 :
root@debugcyrus:/usr/ports/mail/cyrus-imapd30 # diff -u
work/cyrus-imapd-3.0.15/master/master.c /master.c-definitivo
--- work/cyrus-imapd-3.0.15/master/master.c 2021-03-09
04:27:45.000000000 +0100
+++ /master.c-definitivo 2024-09-10 18:36:49.797581000 +0200
@@ -129,6 +129,11 @@
};
static int verbose = 0;
+
+/* RESET MAX_READY_FAILS OF SERVICE IN MASTER WRAPPER */
+static int gotsigurg = 0;
+/* RESET MAX_READY_FAILS OF SERVICE IN MASTER WRAPPER */
+
static int listen_queue_backlog = 32;
static int pidfd = -1;
@@ -1047,6 +1052,22 @@
}
}
+/* RESET MAX_READY_FAILS OF SERVICE IN MASTER WRAPPER */
+static void sigurg_handler(int sig __attribute__((unused)))
+{
+ syslog(LOG_DEBUG, "URG CAPTURADO!!!!!!");
+
+ if (gotsigurg)
+ {
+ gotsigurg = 0;
+ }
+ else
+ {
+ gotsigurg = 1;
+ }
+}
+/* RESET MAX_READY_FAILS OF SERVICE IN MASTER WRAPPER */
+
static void reap_child(void)
{
int status;
@@ -1094,10 +1115,24 @@
"terminated abnormally",
SERVICEPARAM(s->name),
SERVICEPARAM(s->familyname), pid);
- if (now - s->lastreadyfail >
MAX_READY_FAIL_INTERVAL) {
+
+ syslog(LOG_DEBUG, "Senal URG vale....
--%d--",gotsigurg);
+
+ if ((now - s->lastreadyfail >
MAX_READY_FAIL_INTERVAL) || (gotsigurg))
+ {
s->nreadyfails = 0;
+
+ if (gotsigurg)
+ {
+ syslog(LOG_DEBUG, "RESETEANDO....");
+ syslog(LOG_DEBUG, "RESETEADO....");
+ }
}
+
+ syslog(LOG_DEBUG, "too many failures for
service %s/%s, resetting counters due to SIGURG received in Cyrus
master. El got vale
--%d--",SERVICEPARAM(s->name),SERVICEPARAM(s->familyname),gotsigurg);
+
s->lastreadyfail = now;
+
if (++s->nreadyfails >= MAX_READY_FAILS && s->exec) {
syslog(LOG_ERR, "too many failures for "
"service %s/%s, disabling until next SIGHUP",
@@ -1305,11 +1340,18 @@
sigemptyset(&action.sa_mask);
action.sa_handler = sighup_handler;
+
#ifdef SA_RESTART
action.sa_flags |= SA_RESTART;
#endif
if (sigaction(SIGHUP, &action, NULL) < 0)
fatalf(1, "unable to install signal handler for SIGHUP: %m");
+
+ /* RESET MAX_READY_FAILS OF SERVICE IN MASTER WRAPPER */
+ action.sa_handler = sigurg_handler;
+ if (sigaction(SIGURG, &action, NULL) < 0)
+ fatalf(1, "unable to install signal handler for SIGURG: %m");
+ /* RESET MAX_READY_FAILS OF SERVICE IN MASTER WRAPPER */
action.sa_handler = sigalrm_handler;
if (sigaction(SIGALRM, &action, NULL) < 0)
root@debugcyrus:/usr/ports/mail/cyrus-imapd30 #
So by what I have seen Cyrus wrapper is written (and the way it handles
services), I think that the possible solutions could be :
* Send a kill(pid,SIGTERM) for ensuring the process die before
forgetting from s structure.
* Do something similar as I have done, which gives you a time window
for having more failures than expected for some seconds and which later
could be undone with for instance the same signal sending.
Reproduced and something proposed at least.
What do you think about it? :)
Cheers!
sarenet
Egoitz Aurrekoetxea
Departamento de sistemas
94 - 420 94 70
ego...@sarenet.es
www.sarenet.es [1]
Parque Tecnológico. Edificio 103
48170 Zamudio (Bizkaia)
servicios sarenet
Antes de imprimir este correo electrónico piense si es necesario
hacerlo.
Cyrus / Devel / seediscussions +participants +delivery options
Permalink
Links:
------
[1] http://www.sarenet.es
Egoitz Aurrekoetxea
Departamento de sistemas
94 - 420 94 70 | ego...@sarenet.es
S A R E N E T S.A.U.
Parque Tecnológico. Edificio 103 | 48170 Zamudio (Bizkaia) - www.sarenet.es
Antes de imprimir este correo electrónico piense si es necesario hacerlo.
------------------------------------------
Cyrus: Devel
Permalink:
https://cyrus.topicbox.com/groups/devel/Tf9f7cf579fff1397-Md18a5bb55802a5958b7f5534
Delivery options: https://cyrus.topicbox.com/groups/devel/subscription