Hi,
> On 22. Jan 2024, at 20.00, Achim Stahlberger <[email protected]>
> wrote:
>
> Hello,
>
> we see this problem when doveadm-server is under load.
>
> We enabled doveadm http api with this config:
>
> service doveadm {
>   inet_listener {
>     port = 50000
>   }
>   inet_listener http {
>     port = 50001
>     ssl = no
>   }
> }
>
> When the doveadm server receives a lot of connections on port 50000, the
> HTTP service on port 50001 stops responding until the load on port 50000
> drops to zero. It seems like doveadm-server is preferring port 50000 over
> 50001.
>
> Looking at Recv-Q while the HTTP service hangs:
>
> Netid State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
> tcp   LISTEN  129     128     0.0.0.0:50000        0.0.0.0:*
> tcp   LISTEN  1       128     0.0.0.0:50001        0.0.0.0:*
By "drops to zero" do you mean the connection queue has to drain until Recv-Q
is below 129? Or that even then it needs to go down further?
Also, a doveadm process isn't supposed to handle more than one client at a
time. So if it's under a lot of pressure, why does it matter which port it's
answering on, since it can't handle all the connections anyway?
> I used this script to produce load on port 50000
>
> #!/bin/bash
> for i in {1..1000}
> do
>   echo "124" | netcat localhost 50000 &
> done
> wait
>
> When this script is started, it takes several seconds before a connection on
> port 50001 succeeds.
..
> I think the problem is in src/lib/ioloop-epoll.c function
> io_loop_handler_run_internal.
> This might fix the problem (look for new variable rr):
With the fix, wouldn't it still take several seconds to connect to either
50000 or 50001, since now both queues are full? Or why is it different?
Below is a bit simpler patch, which I think does the same as yours:
diff --git a/src/lib/ioloop-epoll.c b/src/lib/ioloop-epoll.c
index ad4100865f..4379680a6e 100644
--- a/src/lib/ioloop-epoll.c
+++ b/src/lib/ioloop-epoll.c
@@ -170,7 +170,7 @@ void io_loop_handler_run_internal(struct ioloop *ioloop)
struct io_file *io;
struct timeval tv;
unsigned int events_count;
- int msecs, ret, i, j;
+ int msecs, ret, i, j, i_start;
bool call;
i_assert(ctx != NULL);
@@ -197,10 +197,11 @@ void io_loop_handler_run_internal(struct ioloop *ioloop)
if (!ioloop->running)
return;
+ i_start = ret <= 1 ? 0 : i_rand_limit(ret);
for (i = 0; i < ret; i++) {
/* io_loop_handle_add() may cause events array reallocation,
so we have use array_idx() */
- event = array_idx(&ctx->events, i);
+ event = array_idx(&ctx->events, (i_start + i) % ret);
list = event->data.ptr;
for (j = 0; j < IOLOOP_IOLIST_IOS_PER_FD; j++) {
Although it's using quite a lot of randomness (= /dev/urandom reads), which
isn't so good. I think it would be just as good to do round-robin:
diff --git a/src/lib/ioloop-epoll.c b/src/lib/ioloop-epoll.c
index ad4100865f..80a5e67cee 100644
--- a/src/lib/ioloop-epoll.c
+++ b/src/lib/ioloop-epoll.c
@@ -197,10 +197,12 @@ void io_loop_handler_run_internal(struct ioloop *ioloop)
if (!ioloop->running)
return;
+ static int i_start = 0;
+ i_start++;
for (i = 0; i < ret; i++) {
/* io_loop_handle_add() may cause events array reallocation,
so we have use array_idx() */
- event = array_idx(&ctx->events, i);
+ event = array_idx(&ctx->events, (i_start + i) % ret);
list = event->data.ptr;
for (j = 0; j < IOLOOP_IOLIST_IOS_PER_FD; j++) {
But even so, I'd like to understand better what exactly this is helping with
before merging. Looking at libevent and nginx, I don't see them doing anything
like this either.
_______________________________________________
dovecot mailing list -- [email protected]
To unsubscribe send an email to [email protected]