From: Luke Hsiao <lukehs...@google.com>

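Currently, io_uring's recvmsg subscribes to both POLLERR and POLLIN. In
the context of TCP tx zero-copy, this is inefficient: the request only
reads the error queue and never uses recvmsg to read incoming data, so
it has no use for POLLIN wakeups.

Clear POLLIN from the poll mask when IORING_OP_RECVMSG is issued with
MSG_ERRQUEUE set.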

This patch was tested with a simple sending program that calls recvmsg
through io_uring with MSG_ERRQUEUE set, verifying with printks that
POLLIN is cleared from the poll mask when the msg flags include
MSG_ERRQUEUE.

Signed-off-by: Arjun Roy <arjun...@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soh...@google.com>
Acked-by: Eric Dumazet <eduma...@google.com>
Reviewed-by: Jens Axboe <ax...@kernel.dk>
Signed-off-by: Luke Hsiao <lukehs...@google.com>
---
 fs/io_uring.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index dc506b75659c..1aa2191ea683 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4932,6 +4932,12 @@ static bool io_arm_poll_handler(struct io_kiocb *req)
                mask |= POLLIN | POLLRDNORM;
        if (def->pollout)
                mask |= POLLOUT | POLLWRNORM;
+
+       /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
+       if ((req->opcode == IORING_OP_RECVMSG) &&
+           (req->sr_msg.msg_flags & MSG_ERRQUEUE))
+               mask &= ~POLLIN;
+
        mask |= POLLERR | POLLPRI;
 
        ipt.pt._qproc = io_async_queue_proc;
-- 
2.28.0.297.g1956fa8f8d-goog
