Thank you for the detailed explanation Fabio, the impact of the bug is now much better understood. Can I ask for an update of the Impact field in the SRU description above? In a summary, but also getting into a bit more details about how this differs from focal+ etc.
One thing I worry about in this SRU is basically what has been mentioned by Ben Hutchings as a follow up to the patch forwarded upstream: https://lists.zytor.com/archives/klibc/2021-December/004635.html I worry that since this issue in ipconfig has been around for *so long* that people could have actually started depending on the behavior being as-is. I don't think klibc's ipconfig is too widely used (as klibc-utils only has like 2 reverse-dependencies), but it's still something to consider. I think I'm not generally against this SRU. Similar to Robie I would like some regression potential analysis done and included in the [Where problems could occur] section - one of them being what I mentioned above. Can you think of any other issues this change could cause? In integration or in different parts of the code? Can we mitigate this somehow by performing some additional testing? If you could modify the two sections of the SRU description, I'd be ready to accept this into bionic-proposed. -- You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to klibc in Ubuntu. https://bugs.launchpad.net/bugs/1947099 Title: ipconfig does not honour user-requested timeouts in some cases Status in klibc package in Ubuntu: New Status in klibc source package in Bionic: Incomplete Bug description: [Impact] In some cases, ipconfig can take a longer time than the user-specified timeouts, causing unexpected delays. [Test Plan] Any situation where ipconfig encounters an error sending the DHCP packet, it will automatically set a delay of 10 seconds, which could be longer than the user-specified timeout. It can be reproduced by creating a dummy interface and attempting to run ipconfig on it with a timeout value of less than 10: # ip link add eth1 type dummy # date; /usr/lib/klibc/bin/ipconfig -t 2 eth1; date Thu Nov 18 04:46:13 EST 2021 IP-Config: eth1 hardware address ae:e0:f5:9d:7e:00 mtu 1500 DHCP RARP IP-Config: no response after 2 secs - giving up Thu Nov 18 04:46:23 EST 2021 ^ Notice above, ipconfig thinks that it waited 2 seconds, but the timestamps show an actual delay of 10 seconds. [Where problems could occur] Please see reproduction steps above. We are seeing this in production too (see comment #2). [Other Info] A patch to fix the issue is being proposed here. It is a safe fix - it only checks before going into sleep that the timeout never exceeds the user-requested value. [Original Description] In some cases, ipconfig can take longer than the user-specified timeouts, causing unexpected delays. in main.c, in function loop(), the process can go into process_timeout_event() (or process_receive_event() ) and if it encounters an error situation, will set an attempt to "try again later" at time equal now + 10 seconds by setting s->expire = now + 10; This can happen at any time during the main event loop, which can end up extending the user-specified timeout if "now + 10" is greater than "start_time + user-specified-timeout". I believe a patch like the following is needed to avoid this problem: --- a/usr/kinit/ipconfig/main.c +++ b/usr/kinit/ipconfig/main.c @@ -437,6 +437,13 @@ static int loop(void) if (timeout > s->expire - now.tv_sec) timeout = s->expire - now.tv_sec; + + /* Compensate for already-lost time */ + gettimeofday(&now, NULL); + if (now.tv_sec + timeout > start + loop_timeout) { + timeout = loop_timeout - (now.tv_sec - start); + printf("Lowered timeout to match user request = (%d s) \n", timeout); + } } I believe the current behaviour is buggy. This is confirmed when the following line is executed: if (loop_timeout >= 0 && now.tv_sec - start >= loop_timeout) { printf("IP-Config: no response after %d " "secs - giving up\n", loop_timeout); rc = -1; goto bail; } 'loop_timeout' is the user-specified time-out. With a value of 2, in case of error, this line prints: IP-Config: no response after 2 secs - giving up So it thinks that it waited 2 seconds - however, in reality it had actually waited for 10 seconds. The suggested code-change ensures that the timeout that is actually used never exceeds the user-specified timeout. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/klibc/+bug/1947099/+subscriptions -- Mailing list: https://launchpad.net/~touch-packages Post to : touch-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~touch-packages More help : https://help.launchpad.net/ListHelp