On Tue, Apr 3, 2018 at 11:19 AM Miroslav Lichvar <mlich...@redhat.com> wrote:
>
> I came across an interesting issue with error messages in sockets with
> enabled timestamping using the SOF_TIMESTAMPING_OPT_CMSG option. When
> the socket is connected and there is an error (e.g. due to destination
> unreachable ICMP), select() indicates there is an exception on the
> socket, but recvmsg() reading from the error queue returns with EAGAIN
> and the application gets stuck in an infinite loop.
>
> Some observations:
> - it happens on both AF_INET and AF_INET6 SOCK_DGRAM sockets
> - enabling the IP_RECVERR option avoids getting EAGAIN
> - using recvmmsg() instead of recvmsg() avoids getting EAGAIN
>   (that is why I didn't notice it earlier)
> - disabling TX timestamping doesn't prevent the socket from having an
>   exception
> - reading from the non-error queue stops the loop
>
> Is this a bug?

POLLERR and select() exceptions on an FD indicate that either (a) there
is a message on the error queue, or (b) sk_err is set. Because of (b),
it's not sufficient to only call recvmsg(MSG_ERRQUEUE). For TCP, we must
always read from or write to the socket after POLLERR. For UDP,
getsockopt(SO_ERROR) returns the actual socket error and clears sk_err.

recvmmsg() had a bug: when reading from the error queue, it checked
sk_err before actually reading the queue. That should be fixed by:
https://patchwork.ozlabs.org/patch/878861/

> It looks to me like SOF_TIMESTAMPING_OPT_CMSG implicitly, but only
> partially, enables IP_RECVERR. Are applications required to use
> IP_RECVERR in this case? My expectation was that without IP_RECVERR
> the error queue would only have messages with transmit timestamps, and
> nothing would change with reporting of real errors.

No, IP_RECVERR and SOF_TIMESTAMPING_OPT_CMSG are completely orthogonal.
With IP_RECVERR, the ICMP packet is simply added to the error queue.
Without IP_RECVERR, an ICMP error packet results in sk_err being set and
the ICMP packet being discarded. This behavior is unrelated to
SOF_TIMESTAMPING_OPT_CMSG: sk_err is always set in response to ICMP
errors, regardless of transmit timestamps.

Since you're only checking the error queue upon a socket
error/exception, you will need to set IP_RECVERR to make sure the ICMP
packet is kept on the error queue. You wouldn't need that if you also
checked getsockopt(SO_ERROR).

> Also, from the
> documentation I had an impression that SOF_TIMESTAMPING_OPT_CMSG is a
> no-op on AF_INET6 sockets.

No, it should work for both v4 and v6.

Thanks,
Soheil

> --
> Miroslav Lichvar
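
For reference, the handling described above (after POLLERR or a select()
exception, drain the error queue and also read SO_ERROR) could look
roughly like the C sketch below. The helper names drain_errqueue(),
take_socket_error() and handle_exception() are hypothetical and only for
illustration; they are not from this thread or from the kernel. The
buffer sizes are arbitrary, and parsing of the control messages
(timestamps, extended errors) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: read every message currently on the error queue.
 * Returns the number of messages read, or -1 on an unexpected failure.
 * EAGAIN only means the queue is empty, so it ends the loop normally. */
static int drain_errqueue(int fd)
{
    char data[512], control[512];   /* arbitrary sizes for this sketch */
    int count = 0;

    for (;;) {
        struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
        struct msghdr msg;

        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = control;
        msg.msg_controllen = sizeof(control);

        ssize_t n = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return count;       /* queue is empty */
            return -1;
        }
        /* A real application would walk the cmsgs in msg here. */
        count++;
    }
}

/* Hypothetical helper: fetch the pending socket error (sk_err).
 * Reading SO_ERROR also clears it, so the exception condition goes away. */
static int take_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;
    return err;
}

/* Call after poll()/select() reports an error or exception on fd.
 * Covers both case (a), queued messages, and case (b), sk_err set.
 * Returns the drain_errqueue() result; *sock_err gets the errno value
 * taken from SO_ERROR, or 0 if none was pending. */
static int handle_exception(int fd, int *sock_err)
{
    assert(sock_err != NULL);

    int queued = drain_errqueue(fd);
    *sock_err = take_socket_error(fd);
    return queued;
}
```

With this shape, an application without IP_RECVERR would see
handle_exception() return 0 queued messages and report a nonzero
*sock_err (for example ECONNREFUSED after an ICMP port unreachable on a
connected UDP socket), instead of looping on EAGAIN.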