From: Luke Hsiao <lukehs...@google.com>

Currently, io_uring's recvmsg subscribes to both POLLERR and POLLIN. In
the context of TCP tx zero-copy, this is inefficient since we are only
reading the error queue and not using recvmsg to read POLLIN responses.

This patch was tested by using a simple sending program to call recvmsg
using io_uring with MSG_ERRQUEUE set and verifying with printks that the
POLLIN is correctly unset when the msg flags are MSG_ERRQUEUE.

Signed-off-by: Arjun Roy <arjun...@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soh...@google.com>
Acked-by: Eric Dumazet <eduma...@google.com>
Reviewed-by: Jens Axboe <ax...@kernel.dk>
Signed-off-by: Luke Hsiao <lukehs...@google.com>
---
 fs/io_uring.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index dc506b75659c..1aa2191ea683 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4932,6 +4932,12 @@ static bool io_arm_poll_handler(struct io_kiocb *req)
 		mask |= POLLIN | POLLRDNORM;
 	if (def->pollout)
 		mask |= POLLOUT | POLLWRNORM;
+
+	/* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
+	if ((req->opcode == IORING_OP_RECVMSG) &&
+	    (req->sr_msg.msg_flags & MSG_ERRQUEUE))
+		mask &= ~POLLIN;
+
 	mask |= POLLERR | POLLPRI;
 
 	ipt.pt._qproc = io_async_queue_proc;
-- 
2.28.0.297.g1956fa8f8d-goog
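
For readers who want to check the mask logic outside the kernel, the following is a minimal userspace sketch of the computation in the hunk above. It is not the kernel code: struct op_def_sketch, arm_poll_mask_sketch() and the is_recvmsg parameter are hypothetical stand-ins for the per-opcode definition, io_arm_poll_handler() and the req->opcode == IORING_OP_RECVMSG test. It assumes a Linux libc that exposes MSG_ERRQUEUE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <poll.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the per-opcode poll defaults (def->pollin, def->pollout). */
struct op_def_sketch {
	bool pollin;
	bool pollout;
};

/*
 * Userspace mirror of the mask computation in the patch.
 * is_recvmsg stands in for (req->opcode == IORING_OP_RECVMSG),
 * msg_flags for req->sr_msg.msg_flags.
 */
static unsigned int arm_poll_mask_sketch(const struct op_def_sketch *def,
					 bool is_recvmsg, int msg_flags)
{
	unsigned int mask = 0;

	if (def->pollin)
		mask |= POLLIN | POLLRDNORM;
	if (def->pollout)
		mask |= POLLOUT | POLLWRNORM;

	/* Error-queue reads do not need wakeups for ordinary readable data. */
	if (is_recvmsg && (msg_flags & MSG_ERRQUEUE))
		mask &= ~POLLIN;

	/* POLLERR and POLLPRI are always requested, as in the patch. */
	mask |= POLLERR | POLLPRI;
	return mask;
}
```

With a pollin-type opcode, passing MSG_ERRQUEUE clears POLLIN while POLLERR stays set. Any other flag combination, or a non-recvmsg opcode, leaves POLLIN in place. As in the patch, only POLLIN is cleared; POLLRDNORM is left untouched.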